Project/Area Number | 19K20354 |
Research Category | Grant-in-Aid for Early-Career Scientists |
Allocation Type | Multi-year Fund |
Review Section | Basic Section 61030: Intelligent informatics-related |
Research Institution | National Institute of Information and Communications Technology |
Principal Investigator | Wang Rui, National Institute of Information and Communications Technology, Advanced Speech Translation Research and Development Promotion Center, Advanced Translation Technology Laboratory, Researcher (00837635) |
Project Period (FY) | 2019-04-01 – 2021-03-31 |
Project Status | Discontinued (Fiscal Year 2020) |
Budget Amount | ¥4,160,000 (Direct Cost: ¥3,200,000, Indirect Cost: ¥960,000)
Fiscal Year 2021: ¥910,000 (Direct Cost: ¥700,000, Indirect Cost: ¥210,000)
Fiscal Year 2020: ¥910,000 (Direct Cost: ¥700,000, Indirect Cost: ¥210,000)
Fiscal Year 2019: ¥2,340,000 (Direct Cost: ¥1,800,000, Indirect Cost: ¥540,000) |
Keywords | Machine Translation / Unsupervised Learning / NLP / AI |
Outline of Research at the Start |
I will conduct unsupervised neural machine translation (UNMT) in universal scenarios. 1. In the rich-resource scenario, abundant monolingual corpora are available; I will generate high-quality pseudo-parallel data by back-translation (a minimal sketch of this step follows below). 2. In the low-resource scenario, monolingual corpora are scarce; I will build high-quality bilingual word embeddings for robust UNMT (see the alignment sketch at the end of this record). 3. In real-world translation, the domain of a user query (the test data) is difficult to predict and sometimes differs from that of the training data; I will adapt the UNMT model according to user queries.
|
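A minimal sketch of the back-translation step named in point 1 above, assuming a trained target-to-source model is already available. The `target_to_source` callable and the toy corpus are hypothetical stand-ins for illustration, not the project's actual implementation:

```python
from typing import Callable, List, Tuple

def back_translate(
    target_monolingual: List[str],
    target_to_source: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Pair each genuine target sentence with a synthetic source sentence.

    The resulting (synthetic source, real target) pairs serve as
    pseudo-parallel training data for the source-to-target model.
    """
    pseudo_parallel = []
    for tgt in target_monolingual:
        synthetic_src = target_to_source(tgt)  # back-translate target -> source
        pseudo_parallel.append((synthetic_src, tgt))
    return pseudo_parallel

if __name__ == "__main__":
    # Hypothetical stand-in for a trained target-to-source model.
    toy_reverse_model = lambda sentence: " ".join(reversed(sentence.split()))
    corpus = ["das ist ein Test", "maschinelle Übersetzung"]
    for src, tgt in back_translate(corpus, toy_reverse_model):
        print(f"synthetic source: {src!r}  ->  real target: {tgt!r}")
```

In the UNMT literature this step is typically run iteratively in both directions, with each model providing back-translations for the other, alongside denoising objectives.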
Outline of Annual Research Achievements |
I have proposed a universal unsupervised approach that trains the translation model without using any parallel data. Whereas existing unsupervised neural machine translation (UNMT) methods had only been applied to similar or rich-resource language pairs, my methods can be adapted to universal scenarios. I have published more than 20 peer-reviewed research papers (I am the corresponding author of most of them), most of them in top-tier conferences and journals: seven ACL papers, one EMNLP paper, two AAAI papers, two ICLR papers, and three IEEE/ACM Transactions articles. I also won several first places in top-tier MT/NLP shared tasks, including WMT-2019, WMT-2020, and CoNLL-2019.
|
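The record does not specify how the bilingual word embeddings in the low-resource scenario (point 2 of the research outline) are built. A standard recipe from the cross-lingual embedding literature, assumed here purely for illustration, is orthogonal Procrustes alignment of two monolingual embedding spaces: given row-paired matrices X and Y, the orthogonal map W minimizing ||XW - Y||_F comes from the SVD of X^T Y. All data below are synthetic:

```python
import numpy as np

def procrustes_align(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
    """Return the orthogonal W minimizing ||X @ W - Y||_F (rows are paired)."""
    # Closed-form solution: SVD of the cross-covariance, then W = U @ Vt.
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 8))                      # toy source-language embeddings
    true_W, _ = np.linalg.qr(rng.normal(size=(8, 8)))  # a random orthogonal map
    Y = X @ true_W                                     # toy target-language embeddings
    W = procrustes_align(X, Y)
    print("alignment error:", np.linalg.norm(X @ W - Y))  # ~0 up to float precision
```

Fully unsupervised variants bootstrap the row pairing itself, for example via adversarial initialization or iterative self-learning, before applying the same Procrustes step.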