2018 Fiscal Year Final Research Report

Zero-shot machine translation using multimodal deep encoder-decoder networks

Project/Area Number	16H05872
Research Category	Grant-in-Aid for Young Scientists (A)
Allocation Type	Single-year Grants
Research Field	Intelligent informatics
Research Institution	The University of Tokyo
Principal Investigator	Nakayama Hideki The University of Tokyo, Graduate School of Information Science and Technology, Associate Professor (00643305)
Project Period (FY)	2016-04-01 – 2019-03-31
Keywords	machine translation / zero-shot learning / multimodal / image recognition / neural networks / representation learning
Outline of Final Research Achievements	In this research, we developed a zero-shot machine translation method that can be trained using only monolingual image-text data, without a parallel text corpus. The method uses images as a hub to align texts in different languages. We also improved the method in several respects to make it more practical, including diversifying its outputs and speeding it up. These results were accepted at top-level international conferences such as ACL and ICLR, and twice received best paper awards at the domestic NLP conference.
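Free Research Field	Image recognition, natural language processing
Academic Significance and Societal Importance of the Research Achievements	Machine translation is an application for which further technical innovation is strongly demanded, yet in the currently prevailing approaches the amount of parallel text corpora available for training is the key to improving performance. In practice, however, text documents that describe the same content in multiple languages are scarce, and such data is currently monopolized by a handful of giant companies such as GAFA. The approach proposed in this research enables training solely from monolingual documents accompanied by images, which anyone can obtain relatively easily. We consider it an original attempt from an academic standpoint and, at the same time, one of considerable societal significance in that it can contribute to the democratization of machine translation.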
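
The idea of using images as a hub can be illustrated with a toy sketch. The example below is not the project's encoder-decoder method; it swaps in simple nearest-neighbor retrieval to show how two monolingual image-text collections might be linked through image similarity alone. The data, the three-dimensional "image features", and the function name translate_via_image are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical toy data: each caption is paired with an image feature vector.
# Both collections are monolingual; no caption is paired with its translation.
EN_DATA = [
    ("a dog runs on the beach", [0.9, 0.1, 0.0]),
    ("a red car on the street", [0.0, 0.9, 0.2]),
]
JA_DATA = [
    ("通りに止まった赤い車", [0.1, 0.8, 0.3]),
    ("浜辺を走る犬", [0.8, 0.2, 0.1]),
]


def translate_via_image(source, src_data, tgt_data):
    """Return the target caption whose image is most similar to the source's image.

    Assumption: retrieval stands in for generation here; the images act as
    the only bridge between the two languages.
    """
    image = dict(src_data)[source]
    best_caption, _ = max(tgt_data, key=lambda pair: cosine(image, pair[1]))
    return best_caption


print(translate_via_image("a dog runs on the beach", EN_DATA, JA_DATA))
# prints: 浜辺を走る犬
```

In this sketch the English and Japanese captions are never seen side by side; the match comes only from their images lying close together in feature space.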