2020 Fiscal Year Research-status Report
Neural Machine Translation Based on Bilingual Resources Extracted from Multimodal Data
Project/Area Number |
19K20343
|
Research Institution | Kyoto University |
Principal Investigator |
チョ シンキ, Kyoto University, Graduate School of Informatics, Program-Specific Associate Professor (70784891)
|
Project Period (FY) |
2019-04-01 – 2022-03-31
|
Keywords | Machine translation / Multimodal |
Outline of Annual Research Achievements |
In FY 2020, we mainly studied the following topics to promote multimodal neural machine translation (MNMT). 1. MNMT with comparable sentences. We propose a new multimodal English-Japanese corpus with comparable sentences compiled from existing image captioning datasets. In addition, we supplement the comparable sentences with a smaller parallel corpus for validation and testing. To assess performance in this comparable-sentence translation scenario, we train several baseline NMT models on our comparable corpus and evaluate their English-Japanese translation performance. 2. MNMT with word-region alignment (WRA). We propose MNMT-WRA, which focuses on semantically relevant image regions during translation. This study strengthens the semantic correlation between the textual and visual modalities in MNMT by integrating WRA. Experimental results on the widely used Multi30k dataset show that our model significantly improves over competitive baselines. 3. Video-guided MT (VMT). We propose a VMT system that uses both temporal and spatial representations of a video to cope with both the motion ambiguity problem and the object ambiguity problem. To obtain spatial features efficiently, we use a hierarchical attention network encoder that models spatial information from the object level to the video level. Experiments on the VATEX dataset show an improvement over a strong baseline method.
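As a rough illustration of the object-level to video-level pooling mentioned in item 3, the sketch below applies attention twice: first over the object features within each frame, then over the resulting frame vectors. This report does not specify the actual encoder, so the dot-product scoring, the fixed query vectors, and all names (`attention_pool`, `hierarchical_pool`, `object_query`, `frame_query`) are assumptions made for illustration only, not the model used in the project.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, query):
    """Weighted average of `vectors`; weights are the softmax of dot products with `query`."""
    scores = [sum(v_i * q_i for v_i, q_i in zip(v, query)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


def hierarchical_pool(video, object_query, frame_query):
    """Hypothetical two-level pooling: objects -> frame vector, frames -> video vector.

    `video` is a list of frames; each frame is a list of object feature vectors.
    The queries stand in for learned parameters (an assumption, not from the report).
    """
    frame_vectors = [attention_pool(objects, object_query)[0] for objects in video]
    video_vector, frame_weights = attention_pool(frame_vectors, frame_query)
    return video_vector, frame_weights


# Toy input: two frames with two and three detected objects, 2-dimensional features.
video = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]],
]
video_vector, frame_weights = hierarchical_pool(video, [1.0, 0.0], [0.0, 1.0])
```

In a trained system the queries and features would come from learned networks; here they are fixed toy values so that the two pooling levels are easy to follow.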
|
Current Status of Research Progress |
2: Research has progressed on the whole more than it was originally planned.
Reason
We planned to study the following items in FY 2020 and finished them as scheduled: 1. MT with parallel sentences and image/region representation fusion. 2. NMT with comparable sentences.
|
Strategy for Future Research Activity |
1. Improve MNMT with parallel and comparable sentences. Although we have shown that our MNMT system with parallel sentences can improve MT by using image regions, the improvement is not significant, so we plan to design novel models to address this. Our MNMT system with comparable sentences is still at the baseline level, so we plan to design MNMT models specifically for comparable sentences. 2. Improve VMT. The current VATEX validation and test sets contain many noisy sentence pairs. We plan to improve their quality via post-editing. After that, we will improve our current model towards better VMT.
|
Research Products
(18 results)