2019 Fiscal Year Research-status Report
Unsupervised Neural Machine Translation in Universal Scenarios
Project/Area Number | 19K20354 |
Research Institution | National Institute of Information and Communications Technology |
Principal Investigator | Wang Rui, National Institute of Information and Communications Technology, Advanced Speech Translation Research and Development Promotion Center, Advanced Translation Technology Laboratory, Researcher (00837635) |
Project Period (FY) | 2019-04-01 – 2022-03-31 |
Keywords | Machine Translation / Unsupervised Learning / NLP |
Outline of Annual Research Achievements |
I have proposed a universal unsupervised approach that trains the translation model without using any parallel data. Compared with existing unsupervised neural machine translation (UNMT) methods, which have only been applied to similar or rich-resource language pairs, my methods can be adapted to universal scenarios. I have published 26 peer-reviewed research papers (I am the corresponding author of most of them). Most of these papers appear in top-tier conferences and journals, including 7 ACL papers (4 at ACL-2019 and 3 at ACL-2020), 1 EMNLP paper, 2 AAAI papers, 2 ICLR papers, and 3 IEEE/ACM Transactions articles.
|
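As background for the achievements above: the standard UNMT recipe in the literature (not necessarily the exact method developed in this project) learns from monolingual data alone via iterative back-translation, in which a model in one direction generates synthetic parallel data to train the reverse direction. A minimal word-level toy sketch, where all names and data are hypothetical and the "model" is simply a translation table:

```python
# Toy illustration of one back-translation round, the core training signal in
# standard UNMT. Real systems use neural models and iterate both directions;
# here a "model" is just a word-substitution table, and unknown words pass through.

def translate(sentence, table):
    """Word-by-word translation using the current model (a lookup table)."""
    return [table.get(w, w) for w in sentence]

def back_translation_step(mono_l1, model_l1_to_l2, model_l2_to_l1):
    """One round of back-translation:
    1. Translate monolingual L1 sentences into L2 with the current L1->L2 model.
    2. Treat (synthetic L2, original L1) as pseudo-parallel data.
    3. Update the reverse (L2->L1) model from those pseudo pairs.
    """
    for sent in mono_l1:
        synthetic_l2 = translate(sent, model_l1_to_l2)
        for src_w, tgt_w in zip(synthetic_l2, sent):
            model_l2_to_l1[src_w] = tgt_w
    return model_l2_to_l1

# Hypothetical seed model, e.g. obtained from cross-lingual embedding alignment.
seed_l1_to_l2 = {"cat": "chat", "dog": "chien"}
mono_l1 = [["cat", "eats"], ["dog", "eats"]]  # monolingual L1 corpus only

reverse_model = back_translation_step(mono_l1, seed_l1_to_l2, {})
```

In the full recipe this step alternates between the two translation directions, so each model improves on the other's synthetic data without any human-annotated parallel corpus.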
Current Status of Research Progress |
1: Research has progressed more than originally planned.
Reason
My research is significantly ahead of the initial proposal: 1) There are approximately 47 papers in the MT track of ACL-2019 (ACL is the top conference in NLP, i.e., natural language processing), and I am one of the most productive authors in that track. In addition, according to AceMap statistics, I was the most productive author in Japan in the NLP field in 2019. 2) Our paper ``Data-dependent Gaussian Prior Objective for Language Generation" received a full review score at ICLR-2020 (ICLR is one of the top conferences in machine learning) and ranked 1st among more than 2,500 submissions (no best paper award was given due to COVID-19). 3) In WMT-2019 (the most famous shared task in MT; I was the corresponding author), we won 1st place in the only unsupervised MT task (German--Czech) by both BLEU and human evaluation.
|
Strategy for Future Research Activity |
In the future, I will tackle a more challenging topic: the interpretability of UNMT. With the development of UNMT, most previous works have focused on improving performance by enhancing the structure of the neural network. However, similar to cognitive science for the human brain, interpreting the behavior of neural networks is also an important topic. I will mainly focus on three aspects: 1) Interpreting the effect of the model: a neural network model appears to be a black box to humans; that is, it is difficult to understand what happens inside UNMT and what each component of UNMT contributes. I will conduct research on interpreting the effect of the neural network model. 2) Interpreting the effect of the data: for UNMT, data serves as the input and output of models, and the knowledge annotated in the data is transferred from data to model. However, few works have considered the effect of this knowledge transfer. I will conduct research on interpreting the process of knowledge transfer via data. 3) Multi-signal UNMT: conventionally, machine translation considers only textual information. In human translation, however, humans also take speech, images, etc., into consideration. Motivated by this, multi-signal or multi-domain neural machine translation is a way to simulate human intelligence.
|
Causes of Carryover |
Travel expenses: due to COVID-19, I did not use the budget to attend conferences; I will therefore use the remaining budget to attend conferences in FY2020. Articles and other expenses: due to COVID-19, some orders were postponed to FY2020; these orders will be resumed after COVID-19 ends in FY2020.
|