2017 Fiscal Year Annual Research Report

Zero-shot machine translation using multimodal deep encoder-decoder networks

Research Project

Project/Area Number	16H05872
Research Institution	The University of Tokyo
Principal Investigator	Hideki Nakayama (中山英樹), The University of Tokyo, Graduate School of Information Science and Technology, Lecturer (00643305)
Project Period (FY)	2016-04-01 – 2019-03-31
Keywords	Machine translation / Unsupervised learning / Natural language processing / Deep learning / Data compression
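Outline of Annual Research Achievements	The basic method for realizing the proposed zero-shot machine translation was already completed in the previous fiscal year. This year, we first consolidated those results and submitted a journal paper, which was accepted by the journal Machine Translation. In addition, we obtained good results in developing the individual component technologies that were set as new goals for this year. First, to improve the encoder network, we developed a flexible attention method that reduces computational cost by choosing an appropriate window width for the attention mechanism according to the input. This method was accepted at the ACL Neural Machine Translation Workshop, an international workshop specializing in neural machine translation, and received the Best Paper Runner-up award. To improve the decoder network, we proposed a heuristic algorithm that uses a tree structure to search the solution space more efficiently than beam search, and submitted a paper on it. Furthermore, to reduce the storage size of word vectors, which is one of the bottlenecks in scaling up the proposed method, we proposed a compositional coding method based on quantization with neural networks. More specifically, the method applies the Gumbel-softmax technique to relax the learning of discrete codes into a continuous problem, so that the codes and the dictionary (basis vectors) are optimized simultaneously. The method was shown to reduce the storage size of word vectors by more than 90% without degrading final machine translation accuracy. This work received the Best Paper Award (最優秀賞) at the Annual Meeting of the Association for Natural Language Processing (言語処理学会年次大会) and was accepted at the International Conference on Learning Representations (ICLR), a top international conference on deep learning.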
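The compositional coding idea in the outline above can be illustrated with a minimal sketch. It is not the project's implementation: the function names, the additive reconstruction (one basis vector per codebook, summed), and all sizes are assumptions made for illustration, and no training loop is included. The sketch shows (a) how a Gumbel-softmax sample turns a discrete code choice into continuous weights, (b) how a word vector is rebuilt from a short discrete code plus shared codebooks, and (c) the storage arithmetic behind the compression.

```python
import math
import random


def gumbel_softmax(logits, temperature, rng):
    """Relaxed one-hot sample: softmax((logits + Gumbel noise) / temperature)."""
    noisy = []
    for logit in logits:
        u = rng.random()
        # Gumbel(0, 1) noise; the small constants guard against log(0).
        g = -math.log(-math.log(u + 1e-20) + 1e-20)
        noisy.append((logit + g) / temperature)
    peak = max(noisy)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def reconstruct(code, codebooks):
    """Discrete path: sum one basis vector per codebook, chosen by the code."""
    dim = len(codebooks[0][0])
    vec = [0.0] * dim
    for book, index in zip(codebooks, code):
        for d in range(dim):
            vec[d] += book[index][d]
    return vec


def relaxed_reconstruct(logits_per_book, codebooks, temperature, rng):
    """Continuous surrogate: weight every basis vector by a relaxed one-hot."""
    dim = len(codebooks[0][0])
    vec = [0.0] * dim
    for logits, book in zip(logits_per_book, codebooks):
        weights = gumbel_softmax(logits, temperature, rng)
        for w, basis in zip(weights, book):
            for d in range(dim):
                vec[d] += w * basis[d]
    return vec


def storage_bits(vocab, dim, num_books, book_size, float_bits=32):
    """Bits for a dense embedding table versus codes plus shared codebooks."""
    dense = vocab * dim * float_bits
    codes = vocab * num_books * math.ceil(math.log2(book_size))
    bases = num_books * book_size * dim * float_bits
    return dense, codes + bases
```

As a purely hypothetical sizing, `storage_bits(10000, 300, 32, 16)` compares a 10,000-word, 300-dimensional float32 table (96,000,000 bits) with 32 four-bit codes per word plus the shared codebooks (6,195,200 bits), a reduction of about 93.5%. These numbers only illustrate the arithmetic and are not taken from the report.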
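Current Status of Research Progress	1: Research has progressed more than originally planned. Reason: The foundational framework for achieving the goals of this research project has already been developed and evaluated, and the work has been published in a well-known international journal in the field, so we consider that results are being obtained steadily. In addition, this year we aimed to improve each component technology in order to further raise the completeness and practicality of the overall method; each effort produced useful results that have been highly regarded at conferences and elsewhere. In particular, the compression of word vectors is an interesting result that was not anticipated at the outset, and we believe it provides a technique that can contribute substantially not only to this project but to the field of deep learning as a whole.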
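Strategy for Future Research Activity	In FY2018 (H30), the final year of this research project, we will further improve each technical component of the proposed method to raise its overall completeness, apply the method to large-scale data, and summarize the results and findings of the research as a whole. (1) Development and implementation of network compression techniques: The proposed deep-learning-based method has many parameters in its word vectors (distributed representations), in the recurrent neural networks that handle text, and in the convolutional neural networks that handle images, so the amount of data that can be placed in GPU memory at one time is severely limited. This slows computation and hinders the application of the method to large-scale data. Compressing these parameters as much as possible is therefore essential for making the method truly practical. We will extend the word vector compression method developed this year and work on a method for efficiently compressing general neural networks. (2) Development of a high-accuracy decoder: When generating translated sentences, the commonly used beam search algorithm is not guaranteed to produce good solutions and does not make use of the input provided by the multimodal encoder. In the coming fiscal year, we will develop a language decoder that does not rely on beam search and more actively exploits the structure of the multimodal space, aiming at high-accuracy generation of translated sentences. (3) Verification on large-scale data: We will integrate the developed component technologies into a more complete version of the proposed method, apply it to large-scale data, and summarize the resulting outcomes and findings.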

Research Products
(20 results)

Journal Article (6 results) (of which Peer Reviewed: 6 results, Open Access: 3 results) / Presentation (12 results) (of which Int'l Joint Research: 6 results, Invited: 1 result) / Remarks (2 results)

[Journal Article] Augmenting Image Question Answering Dataset by Exploiting Image Captions (2018)
- Author(s)
  Masashi Yokota and Hideki Nakayama
- Journal Title
  
  Proceedings of International Conference on Language Resources and Evaluation (LREC)
  
  Volume: In press Pages: In press
- Peer Reviewed
[Journal Article] Compressing Word Embeddings via Deep Compositional Code Learning (2018)
- Author(s)
  Raphael Shu and Hideki Nakayama
- Journal Title
  
  Proceedings of International Conference on Learning Representations (ICLR)
  
  Volume: In press Pages: In press
- Peer Reviewed
[Journal Article] Zero-resource Machine Translation by Multimodal Encoder-Decoder Network with Multimedia Pivot (2017)
- Author(s)
  Hideki Nakayama and Noriki Nishida
- Journal Title
  
  Machine Translation
  
  Volume: 31 Pages: 49-64
- DOI
  10.1007/s10590-017-9197-z
- Peer Reviewed / Open Access
[Journal Article] An Empirical Study of Adequate Vision Span for Attention-Based Neural Machine Translation (2017)
- Author(s)
  Raphael Shu and Hideki Nakayama
- Journal Title
  
  Proceedings of the First Workshop on Neural Machine Translation
  
  Volume: - Pages: 1-10
- DOI
  10.18653/v1/W17-3201
- Peer Reviewed / Open Access
[Journal Article] Word Ordering as Unsupervised Learning Towards Syntactically Plausible Word Representations (2017)
- Author(s)
  Noriki Nishida and Hideki Nakayama
- Journal Title
  
  Proceedings of the Eighth International Joint Conference on Natural Language Processing (IJCNLP)
  
  Volume: 1 Pages: 70-79
- Peer Reviewed / Open Access
[Journal Article] Bag of Local Convolutional Triplets for Script Identification in Scene Text (2017)
- Author(s)
  Jan Zdenek and Hideki Nakayama
- Journal Title
  
  Proceedings of 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)
  
  Volume: - Pages: 369-375
- DOI
  10.1109/ICDAR.2017.68
- Peer Reviewed
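[Presentation] Semi-supervised Implicit Discourse Relation Recognition Based on Local Coherence of Text (テキストの局所一貫性に基づく半教師あり暗黙的談話関係認識) (2018)
- Author(s)
  Noriki Nishida (西田典起)
- Organizer
  Annual Meeting of the Association for Natural Language Processing (言語処理学会年次大会)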
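[Presentation] Compressing Word Embeddings via Deep Code Learning (深層コード学習による単語分散表現の圧縮) (2018)
- Author(s)
  Raphael Shu (朱中元)
- Organizer
  Annual Meeting of the Association for Natural Language Processing (言語処理学会年次大会)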
[Presentation] Augmenting Image Question Answering Dataset by Exploiting Image Captions (2018)
- Author(s)
  Masashi Yokota
- Organizer
  International Conference on Language Resources and Evaluation (LREC)
- Int'l Joint Research
[Presentation] Compressing Word Embeddings via Deep Compositional Code Learning (2018)
- Author(s)
  Raphael Shu
- Organizer
  International Conference on Learning Representations (ICLR)
- Int'l Joint Research
[Presentation] An Empirical Study of Adequate Vision Span for Attention-Based Neural Machine Translation (2017)
- Author(s)
  Raphael Shu
- Organizer
  First Workshop on Neural Machine Translation
- Int'l Joint Research
[Presentation] Word Ordering as Unsupervised Learning Towards Syntactically Plausible Word Representations (2017)
- Author(s)
  Noriki Nishida
- Organizer
  International Joint Conference on Natural Language Processing (IJCNLP)
- Int'l Joint Research
[Presentation] Bag of Local Convolutional Triplets for Script Identification in Scene Text (2017)
- Author(s)
  Jan Zdenek
- Organizer
  International Conference on Document Analysis and Recognition (ICDAR)
- Int'l Joint Research
[Presentation] Pivot-based Multimodality Integration for Unsupervised Cross-domain Machine Intelligence (2017)
- Author(s)
  Hideki Nakayama
- Organizer
  International Symposium on Research and Education of Computational Science (RECS)
- Int'l Joint Research / Invited
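[Presentation] Reducing the Computational Cost of Context-aware Attention Mechanisms (文脈を考慮したアテンションメカニズムの計算量の削減) (2017)
- Author(s)
  Raphael Shu (朱中元)
- Organizer
  Annual Conference of the Japanese Society for Artificial Intelligence (人工知能学会全国大会)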
[Presentation] Script Identification using Bag-of-Words with Entropy-weighted Patches (2017)
- Author(s)
  Jan Zdenek (ズデニェク・ヤン)
- Organizer
  Annual Conference of the Japanese Society for Artificial Intelligence (人工知能学会全国大会)
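[Presentation] A Data Augmentation Method Based on Question Generation Using Scene Graphs (シーングラフを用いた質問文生成によるデータ拡張の手法) (2017)
- Author(s)
  Masashi Yokota (横田匡史)
- Organizer
  Annual Conference of the Japanese Society for Artificial Intelligence (人工知能学会全国大会)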
[Presentation] Learning Syntactically Plausible Word Representations by Solving Word Ordering (2017)
- Author(s)
  Noriki Nishida (西田典起)
- Organizer
  Annual Conference of the Japanese Society for Artificial Intelligence (人工知能学会全国大会)
[Remarks] Research results on the integration of language and vision
- URL
  http://www.nlab.ci.i.u-tokyo.ac.jp/projects/vision_and_language.html
[Remarks] Research results on natural language processing
- URL
  http://www.nlab.ci.i.u-tokyo.ac.jp/projects/nlp.html
