Neural Machine Translation Based on Bilingual Resources Extracted from Multimodal Data

Research Project

Project/Area Number	19K20343
Research Category	Grant-in-Aid for Early-Career Scientists
Allocation Type	Multi-year Fund
Review Section	Basic Section 61030:Intelligent informatics-related
Research Institution	Kyoto University (2020-2021) Osaka University (2019)
Principal Investigator	Chu Chenhui Kyoto University, Graduate School of Informatics, Program-Specific Associate Professor (70784891)
Project Period (FY)	2019-04-01 – 2022-03-31
Project Status	Completed (Fiscal Year 2021)
Budget Amount	¥4,160,000 (Direct Cost: ¥3,200,000, Indirect Cost: ¥960,000) Fiscal Year 2021: ¥910,000 (Direct Cost: ¥700,000, Indirect Cost: ¥210,000) Fiscal Year 2020: ¥910,000 (Direct Cost: ¥700,000, Indirect Cost: ¥210,000) Fiscal Year 2019: ¥2,340,000 (Direct Cost: ¥1,800,000, Indirect Cost: ¥540,000)
Keywords	Machine translation / Multimodal / Neural machine translation / Multimodal data / Bilingual resources
Outline of Research at the Start	In machine translation (MT), translation knowledge is acquired from parallel corpora (sentence-aligned bilingual texts). However, domain-specific parallel corpora are scarce or nonexistent for most languages, so MT performs poorly in such scenarios. We aim to address this problem using state-of-the-art neural MT (NMT). Our core idea is to extract parallel data from multimodal data consisting of images and multilingual descriptive text, which is widely available on the web and social media, and to study NMT using the extracted parallel data.
Outline of Final Research Achievements	In this project, we mainly studied the following topics in multimodal neural machine translation (MNMT). 1) MNMT with comparable sentences. We constructed a dataset for MNMT with comparable sentences and organized a shared task at the 8th Workshop on Asian Translation (WAT 2021). Our system achieved the best performance in this shared task. 2) MNMT with semantic image regions and word-region alignment. We studied both approaches and published the results in two international journals, Neurocomputing and IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP). 3) Video-guided MT (VMT). We proposed a VMT model with a spatial hierarchical attention network, which can address both verb and noun sense disambiguation.
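Academic Significance and Societal Importance of the Research Achievements	Multimodal neural machine translation (MNMT) has mainly been studied with the aim of resolving the semantic ambiguity of natural language in machine translation. In this project, we devised a new framework, MNMT with comparable sentences in a low-resource setting; for image-based MNMT, we proposed MNMT with semantic image regions and word-region alignment; and for video-based MNMT, we proposed a spatial hierarchical attention network. Together, these results demonstrate the effectiveness of using visual information in machine translation. The MNMT systems we developed can help improve the accuracy of automatic subtitle translation for movies, dramas, anime, and news, and can also meet automatic translation needs at international events such as the Osaka Expo.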

Report

(4 results)

2021 Annual Research Report Final Research Report ( PDF )
2020 Research-status Report
2019 Research-status Report

Research Products
(40 results)


Int'l Joint Research (4 results) Journal Article (12 results) (of which Int'l Joint Research: 3 results, Peer Reviewed: 4 results, Open Access: 12 results) Presentation (21 results) (of which Int'l Joint Research: 8 results, Invited: 1 result) Remarks (3 results)

[Int'l Joint Research] EPFL (Switzerland)
- Related Report
  2021 Annual Research Report
[Int'l Joint Research] Shanghai Jiao Tong University (China)
- Related Report
  2020 Research-status Report
[Int'l Joint Research] Microsoft, Hyderabad (India)
- Related Report
  2020 Research-status Report
[Int'l Joint Research] University of Georgia (USA)
- Related Report
  2020 Research-status Report
[Journal Article] Region-attentive multimodal neural machine translation (2022)
- Author(s)
  Zhao Yuting, Komachi Mamoru, Kajiwara Tomoyuki, Chu Chenhui
- Journal Title
  Neurocomputing
  Volume: 476 Pages: 1-13
- DOI
  10.1016/j.neucom.2021.12.076
- Related Report
  2021 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Word-Region Alignment-Guided Multimodal Neural Machine Translation (2022)
- Author(s)
  Zhao Yuting, Komachi Mamoru, Kajiwara Tomoyuki, Chu Chenhui
- Journal Title
  IEEE/ACM Transactions on Audio, Speech, and Language Processing
  Volume: 30 Pages: 244-259
- DOI
  10.1109/taslp.2021.3138719
- Related Report
  2021 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Spoken-Written Japanese Conversion for Japanese-English University-Lecture Translation (2021)
- Author(s)
  Nakao Ryota, Chu Chenhui, Kurohashi Sadao
- Journal Title
  Journal of Natural Language Processing
  Volume: 28 Issue: 4 Pages: 1034-1052
- DOI
  10.5715/jnlp.28.1034
- NAID
  130008129491
- ISSN
  1340-7619, 2185-8314
- Related Report
  2021 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Preordering Encoding on Transformer for Translation (2021)
- Author(s)
  Kawara Yuki, Chu Chenhui, Arase Yuki
- Journal Title
  IEEE/ACM Transactions on Audio, Speech, and Language Processing
  Volume: 29 Pages: 644-655
- DOI
  10.1109/taslp.2020.3042001
- Related Report
  2020 Research-status Report
- Open Access
[Journal Article] A Survey of Multilingual Neural Machine Translation (2020)
- Author(s)
  Dabre Raj, Chu Chenhui, Kunchukuttan Anoop
- Journal Title
  ACM Computing Surveys
  Volume: 53 Issue: 5 Pages: 1-38
- DOI
  10.1145/3406095
- Related Report
  2020 Research-status Report
- Open Access / Int'l Joint Research
[Journal Article] A Survey of Domain Adaptation for Machine Translation (2020)
- Author(s)
  Chu Chenhui, Wang Rui
- Journal Title
  Journal of Information Processing
  Volume: 28 Issue: 0 Pages: 413-426
- DOI
  10.2197/ipsjjip.28.413
- NAID
  130007887723
- ISSN
  1882-6652
- Related Report
  2020 Research-status Report
- Open Access / Int'l Joint Research
[Journal Article] A Corpus for English-Japanese Multimodal Neural Machine Translation with Comparable Sentences (2020)
- Author(s)
  Andrew Merritt, Chenhui Chu, Yuki Arase
- Journal Title
  arXiv:2010.08725
  Volume: -
- Related Report
  2020 Research-status Report
- Open Access / Int'l Joint Research
[Journal Article] Lexically Cohesive Neural Machine Translation with Copy Mechanism (2020)
- Author(s)
  Vipul Mishra, Chenhui Chu, Yuki Arase
- Journal Title
  arXiv:2010.05193
  Volume: -
- Related Report
  2020 Research-status Report
- Open Access
[Journal Article] A Comprehensive Survey of Multilingual Neural Machine Translation (2020)
- Author(s)
  Raj Dabre, Chenhui Chu, Anoop Kunchukuttan
- Journal Title
  arXiv:2001.01115
  Volume: -
- Related Report
  2019 Research-status Report
- Open Access
[Journal Article] Multilingual Multi-Domain Adaptation Approaches for Neural Machine Translation (2019)
- Author(s)
  Chenhui Chu, Raj Dabre
- Journal Title
  arXiv:1906.07978
  Volume: -
- Related Report
  2019 Research-status Report
- Open Access
[Journal Article] A Survey of Multilingual Neural Machine Translation (2019)
- Author(s)
  Raj Dabre, Chenhui Chu, Anoop Kunchukuttan
- Journal Title
  arXiv:1905.05395
  Volume: -
- Related Report
  2019 Research-status Report
- Open Access
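[Journal Article] Use of Bilingual Dictionaries Based on a Word Reward Model in Neural Machine Translation (2019)
- Author(s)
  Yuto Takebayashi, Chenhui Chu, Yuki Arase, Masaaki Nagata
- Journal Title
  Journal of Natural Language Processing
  Volume: 26 Pages: 711-731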
- Related Report
  2019 Research-status Report
- Peer Reviewed / Open Access
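[Presentation] A Study of Construction Methods for a Multimodal Machine Translation Dataset Focusing on Ambiguous Translations (2022)
- Author(s)
  Yihang Li, Shuichiro Shimizu, Chenhui Chu, Sadao Kurohashi
- Organizer
  28th Annual Meeting of the Association for Natural Language Processing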
- Related Report
  2021 Annual Research Report
[Presentation] Representative Data Selection for Sequence-to-Sequence Pre-training (2022)
- Author(s)
  Haiyue Song, Raj Dabre, Zhuoyuan Mao, Chenhui Chu, Sadao Kurohashi
- Organizer
  28th Annual Meeting of the Association for Natural Language Processing
- Related Report
  2021 Annual Research Report
[Presentation] TMEKU System for the WAT2021 Multimodal Translation Task (2021)
- Author(s)
  Yuting Zhao, Mamoru Komachi, Tomoyuki Kajiwara, Chenhui Chu
- Organizer
  In Proceedings of the 8th Workshop on Asian Translation (WAT2021)
- Related Report
  2021 Annual Research Report
- Int'l Joint Research
[Presentation] Overview of the 8th Workshop on Asian Translation (2021)
- Author(s)
  Toshiaki Nakazawa, Hideki Nakayama, Chenchen Ding, Raj Dabre, Shohei Higashiyama, Hideya Mino, Isao Goto, Win Pa Pa, Anoop Kunchukuttan, Shantipriya Parida, Ondrej Bojar, Chenhui Chu, Akiko Eriguchi, Kaori Abe, Yusuke Oda, Sadao Kurohashi
- Organizer
  In Proceedings of the 8th Workshop on Asian Translation (WAT2021)
- Related Report
  2021 Annual Research Report
- Int'l Joint Research
[Presentation] Video-guided Machine Translation with Spatial Hierarchical Attention Network (2021)
- Author(s)
  Weiqi Gu, Haiyue Song, Chenhui Chu, Sadao Kurohashi
- Organizer
  In Proceedings of the ACL-IJCNLP 2021 Student Research Workshop
- Related Report
  2021 Annual Research Report
- Int'l Joint Research
[Presentation] Lightweight Cross-Lingual Sentence Representation Learning (2021)
- Author(s)
  Zhuoyuan Mao, Prakhar Gupta, Chenhui Chu, Martin Jaggi, Sadao Kurohashi
- Organizer
  In Proceedings of ACL-IJCNLP 2021
- Related Report
  2021 Annual Research Report
- Int'l Joint Research
[Presentation] Multilingual Neural Machine Translation (Tutorial) (2020)
- Author(s)
  Raj Dabre, Chenhui Chu, Anoop Kunchukuttan
- Organizer
  In Proceedings of the 28th International Conference on Computational Linguistics (COLING 2020)
- Related Report
  2020 Research-status Report
- Int'l Joint Research
[Presentation] Double Attention-based Multimodal Neural Machine Translation with Semantic Image Regions (2020)
- Author(s)
  Yuting Zhao, Mamoru Komachi, Tomoyuki Kajiwara, Chenhui Chu
- Organizer
  In Proceedings of the 22nd Annual Conference of the European Association for Machine Translation (EAMT 2020)
- Related Report
  2020 Research-status Report
- Int'l Joint Research
[Presentation] Video-guided Machine Translation with Spatial Hierarchical Attention Network Encoder (2020)
- Author(s)
  Weiqi Gu, Haiyue Song, Chenhui Chu, Sadao Kurohashi
- Organizer
  27th Annual Meeting of the Association for Natural Language Processing
- Related Report
  2020 Research-status Report
[Presentation] Self-supervised Dynamic Programming Encoding for Neural Machine Translation (2020)
- Author(s)
  Haiyue Song, Raj Dabre, Chenhui Chu, Sadao Kurohashi, Eiichiro Sumita
- Organizer
  27th Annual Meeting of the Association for Natural Language Processing
- Related Report
  2020 Research-status Report
[Presentation] Learning Cross-lingual Sentence Representations for Multilingual Document Classification with Token-level Reconstruction (2020)
- Author(s)
  Zhuoyuan Mao, Prakhar Gupta, Chenhui Chu, Martin Jaggi, Sadao Kurohashi
- Organizer
  27th Annual Meeting of the Association for Natural Language Processing
- Related Report
  2020 Research-status Report
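[Presentation] A Study of Methods for Applying Preordering to Non-Autoregressive Translation Models (2020)
- Author(s)
  Yuki Kawara, Chenhui Chu, Yuki Arase
- Organizer
  27th Annual Meeting of the Association for Natural Language Processing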
- Related Report
  2020 Research-status Report
[Presentation] End-to-End Speech Translation with Cross-lingual Transfer Learning (2020)
- Author(s)
  Shuichiro Shimizu, Chenhui Chu, Sheng Li, Sadao Kurohashi
- Organizer
  27th Annual Meeting of the Association for Natural Language Processing
- Related Report
  2020 Research-status Report
[Presentation] Neural Machine Translation with Semantic Relevant Image Regions (2020)
- Author(s)
  Yuting Zhao, Mamoru Komachi, Tomoyuki Kajiwara, Chenhui Chu
- Organizer
  27th Annual Meeting of the Association for Natural Language Processing
- Related Report
  2020 Research-status Report
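[Presentation] Improving Japanese-English Translation of University Lectures through Spoken-to-Written Japanese Conversion (2020)
- Author(s)
  Ryota Nakao, Chenhui Chu, Sadao Kurohashi
- Organizer
  27th Annual Meeting of the Association for Natural Language Processing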
- Related Report
  2020 Research-status Report
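[Presentation] Japanese-English Machine Translation with a Transformer Using Preordered Position Representations (2020)
- Author(s)
  Yuki Kawara, Chenhui Chu, Yuki Arase
- Organizer
  26th Annual Meeting of the Association for Natural Language Processing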
- Related Report
  2019 Research-status Report
[Presentation] Double Attention-based Multimodal Neural Machine Translation with Semantic Image Regions (2019)
- Author(s)
  Yuting Zhao, Mamoru Komachi, Tomoyuki Kajiwara, Chenhui Chu
- Organizer
  241st Meeting of the IPSJ Special Interest Group on Natural Language Processing / 14th Symposium of the Young Researcher Association for NLP Studies (YANS)
- Related Report
  2019 Research-status Report
[Presentation] Comparison and Analysis of 2-to-2 and Hierarchical RNN Models on Japanese-to-English Context-Aware Translation (2019)
- Author(s)
  Mishra Vipul, Yuki Kawara, Chenhui Chu, Yuki Arase
- Organizer
  14th Symposium of the Young Researcher Association for NLP Studies (YANS)
- Related Report
  2019 Research-status Report
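[Presentation] Improving the Translation Accuracy of English-Japanese Transformer Models by Preordering (2019)
- Author(s)
  Yuki Kawara, Chenhui Chu, Yuki Arase
- Organizer
  14th Symposium of the Young Researcher Association for NLP Studies (YANS)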
- Related Report
  2019 Research-status Report
[Presentation] Exploiting Multilingualism through Multistage Fine-Tuning for Low-Resource Neural Machine Translation (2019)
- Author(s)
  Raj Dabre, Atsushi Fujita, Chenhui Chu
- Organizer
  In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP 2019)
- Related Report
  2019 Research-status Report
- Int'l Joint Research
[Presentation] Domain Adaptation for Neural Machine Translation (2019)
- Author(s)
  Chenhui Chu, Rui Wang
- Organizer
  The 15th China Conference on Machine Translation (CCMT 2019)
- Related Report
  2019 Research-status Report
- Int'l Joint Research / Invited
[Remarks] https://researchmap.jp/chu/
- Related Report
  2021 Annual Research Report
[Remarks] https://researchmap.jp/chu/
- Related Report
  2020 Research-status Report
[Remarks] https://researchmap.jp/chu/
- Related Report
  2019 Research-status Report
