FY2020 Research Status Report

Neural Machine Translation by Extracting Parallel Resources from Multimodal Data

Research Project

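Project/Area Number	19K20343
Research Institution	Kyoto University
Principal Investigator	Chenhui Chu (チョ シンキ), Kyoto University, Graduate School of Informatics, Program-Specific Associate Professor (70784891)
Project Period (FY)	2019-04-01 – 2022-03-31
Keywords	machine translation / multimodal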
Outline of Research Achievements	In FY 2020, we mainly studied the following topics to advance multimodal neural machine translation (MNMT).
1. MNMT with comparable sentences. We propose a new multimodal English-Japanese corpus of comparable sentences compiled from existing image captioning datasets. We supplement the comparable sentences with a smaller parallel corpus for validation and testing. To measure performance in this comparable-sentence translation scenario, we train several baseline NMT models on our comparable corpus and evaluate their English-Japanese translation performance.
2. MNMT with word-region alignment (WRA). We propose MNMT-WRA, which focuses on semantically relevant image regions during translation. By integrating WRA, this study strengthens the semantic correlation between the textual and visual modalities in MNMT. Experimental results on the widely used Multi30k dataset show that our model significantly improves over competitive baselines.
3. Video-guided MT (VMT). We propose a VMT system that uses both temporal and spatial representations of a video to cope with both the motion ambiguity problem and the object ambiguity problem. To obtain spatial features efficiently, we use a hierarchical attention network encoder that models spatial information from the object level to the video level. Experiments on the VATEX dataset show improvement over a strong baseline method.
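The object-level to video-level hierarchical attention mentioned in item 3 of the outline can be illustrated with a minimal sketch. This is not the authors' model: it is a generic two-level attention pooling written in plain Python, and every name in it (`attention_pool`, `hierarchical_video_vector`, the query vectors, the toy feature values) is a hypothetical choice made for illustration. In a trained system the queries would be learned parameters and the object features would come from an object detector.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def attention_pool(vectors: List[Vector], query: Vector) -> Vector:
    """Weighted average of `vectors`; weights are softmax(dot(vector, query))."""
    weights = softmax([dot(v, query) for v in vectors])
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]


def hierarchical_video_vector(
    video: List[List[Vector]], object_query: Vector, frame_query: Vector
) -> Vector:
    """Pool object features into frame vectors, then frame vectors into one video vector."""
    # Level 1 (assumed): attend over the objects detected in each frame.
    frame_vectors = [attention_pool(objects, object_query) for objects in video]
    # Level 2 (assumed): attend over the resulting frame vectors.
    return attention_pool(frame_vectors, frame_query)


# Toy input: 2 frames, each holding a few 2-dimensional object features (made-up values).
video = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 1.0], [0.0, 0.0], [0.5, 0.5]],
]
# Zero queries give uniform attention, so each level reduces to a plain mean.
uniform = hierarchical_video_vector(video, [0.0, 0.0], [0.0, 0.0])
# A non-zero object query shifts weight toward objects with a large first component.
focused = hierarchical_video_vector(video, [2.0, 0.0], [0.0, 0.0])
print(uniform, focused)
```

Pooling in two stages keeps the number of items attended over at each step small (objects per frame, then frames per video), which is one plausible reading of why a hierarchical encoder is described as an efficient way to obtain spatial features.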
Current Status of Research Progress (Category)	2: Progressing mostly smoothly
Reason	We planned to study the following items in FY 2020 and finished them as scheduled: 1. MT with parallel sentences and image/region representation fusion. 2. NMT with comparable sentences.
Strategy for Future Research Activity	
1. Improve MNMT with parallel and comparable sentences. Although we have shown that our MNMT system with parallel sentences can improve MT by using image regions, the improvement is not significant, so we plan to design novel models to address this. Our MNMT system with comparable sentences is still at the baseline level, so we plan to design MNMT models specifically for comparable sentences.
2. Improve VMT. The current VATEX validation and test sets contain many noisy sentence pairs. We plan to improve their quality via post-editing. After that, we will improve our current model towards better VMT.

Research Products
(18 results)


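International Joint Research (3 results), Journal Articles (5 results; of which 3 internationally co-authored, 5 open access), Presentations (9 results; of which 2 at international conferences), Remarks (1 result)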

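[International Joint Research] Shanghai Jiao Tong University (China)
- Country
  China
- Foreign Institution
  Shanghai Jiao Tong University
[International Joint Research] Microsoft, Hyderabad (India)
- Country
  India
- Foreign Institution
  Microsoft, Hyderabad
[International Joint Research] University of Georgia (USA)
- Country
  USA
- Foreign Institution
  University of Georgia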
[Journal Article] Preordering Encoding on Transformer for Translation (2021)
- Authors/Presenters
  Kawara Yuki, Chu Chenhui, Arase Yuki
- Journal
  IEEE/ACM Transactions on Audio, Speech, and Language Processing
  Volume: 29, Pages: 644-655
- DOI
  10.1109/taslp.2020.3042001
- Open Access
[Journal Article] A Survey of Multilingual Neural Machine Translation (2020)
- Authors/Presenters
  Dabre Raj, Chu Chenhui, Kunchukuttan Anoop
- Journal
  ACM Computing Surveys
  Volume: 53, Pages: 1-38
- DOI
  10.1145/3406095
- Open Access / International Co-authorship
[Journal Article] A Survey of Domain Adaptation for Machine Translation (2020)
- Authors/Presenters
  Chu Chenhui, Wang Rui
- Journal
  Journal of Information Processing
  Volume: 28, Pages: 413-426
- DOI
  10.2197/ipsjjip.28.413
- Open Access / International Co-authorship
[Journal Article] A Corpus for English-Japanese Multimodal Neural Machine Translation with Comparable Sentences (2020)
- Authors/Presenters
  Andrew Merritt, Chenhui Chu, Yuki Arase
- Journal
  arXiv:2010.08725
  Volume: -, Pages: -
- Open Access / International Co-authorship
[Journal Article] Lexically Cohesive Neural Machine Translation with Copy Mechanism (2020)
- Authors/Presenters
  Vipul Mishra, Chenhui Chu, Yuki Arase
- Journal
  arXiv:2010.05193
  Volume: -, Pages: -
- Open Access
[Presentation] Multilingual Neural Machine Translation (Tutorial) (2020)
- Authors/Presenters
  Raj Dabre, Chenhui Chu, Anoop Kunchukuttan
- Conference/Organization
  In Proceedings of the 28th International Conference on Computational Linguistics (COLING 2020)
- International Conference
[Presentation] Double Attention-based Multimodal Neural Machine Translation with Semantic Image Regions (2020)
- Authors/Presenters
  Yuting Zhao, Mamoru Komachi, Tomoyuki Kajiwara, Chenhui Chu
- Conference/Organization
  In Proceedings of the 22nd Annual Conference of the European Association for Machine Translation (EAMT 2020)
- International Conference
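[Presentation] Video-guided Machine Translation with Spatial Hierarchical Attention Network Encoder (2020)
- Authors/Presenters
  Weiqi Gu, Haiyue Song, Chenhui Chu, Sadao Kurohashi
- Conference/Organization
  The 27th Annual Meeting of the Association for Natural Language Processing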
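[Presentation] Self-supervised Dynamic Programming Encoding for Neural Machine Translation (2020)
- Authors/Presenters
  Haiyue Song, Raj Dabre, Chenhui Chu, Sadao Kurohashi, Eiichiro Sumita
- Conference/Organization
  The 27th Annual Meeting of the Association for Natural Language Processing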
[Presentation] Learning Cross-lingual Sentence Representations for Multilingual Document Classification with Token-level Reconstruction (2020)
- Authors/Presenters
  Zhuoyuan Mao, Prakhar Gupta, Chenhui Chu, Martin Jaggi, Sadao Kurohashi
- Conference/Organization
  The 27th Annual Meeting of the Association for Natural Language Processing
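[Presentation] A Study of Methods for Applying Preordering to Non-Autoregressive Translation Models (original title: Non-Autoregressive Translationモデルにおける事前並び替え適用手法の検討) (2020)
- Authors/Presenters
  Yuki Kawara, Chenhui Chu, Yuki Arase
- Conference/Organization
  The 27th Annual Meeting of the Association for Natural Language Processing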
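[Presentation] End-to-End Speech Translation with Cross-lingual Transfer Learning (2020)
- Authors/Presenters
  Shuichiro Shimizu, Chenhui Chu, Sheng Li, Sadao Kurohashi
- Conference/Organization
  The 27th Annual Meeting of the Association for Natural Language Processing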
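[Presentation] Neural Machine Translation with Semantic Relevant Image Regions (2020)
- Authors/Presenters
  Yuting Zhao, Mamoru Komachi, Tomoyuki Kajiwara, Chenhui Chu
- Conference/Organization
  The 27th Annual Meeting of the Association for Natural Language Processing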
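[Presentation] Improving Japanese-English Translation of University Lectures through Japanese Spoken-to-Written Style Conversion (original title: 日本語話し言葉書き言葉変換による大学講義の日英翻訳の精度向上) (2020)
- Authors/Presenters
  Ryota Nakao (中尾亮太), Chenhui Chu, Sadao Kurohashi
- Conference/Organization
  The 27th Annual Meeting of the Association for Natural Language Processing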
[Remarks]
- URL
  https://researchmap.jp/chu/
