Outline of Annual Research Achievements |
In FY 2020, we mainly studied the following three topics to advance multimodal neural machine translation (MNMT). 1. MNMT with comparable sentences. We propose a new multimodal English-Japanese corpus of comparable sentences compiled from existing image captioning datasets. In addition, we supplement the comparable sentences with a smaller parallel corpus for validation and testing. To assess performance in this comparable-sentence translation scenario, we train several baseline NMT models on our comparable corpus and evaluate their English-Japanese translation performance. 2. MNMT with word-region alignment (WRA). We propose MNMT-WRA, which focuses on semantically relevant image regions during translation. This study strengthens the semantic correlation between the textual and visual modalities in MNMT by integrating WRA. Experimental results on the widely used Multi30k dataset show that our model significantly improves over competitive baselines. 3. Video-guided MT (VMT). We propose a VMT system that uses both temporal and spatial representations of a video to cope with both the motion ambiguity problem and the object ambiguity problem. To obtain spatial features efficiently, we use a hierarchical attention network encoder that models spatial information from the object level to the video level. Experiments on the VATEX dataset show improvement over a strong baseline method.
|
Strategy for Future Research Activity |
1. Improve MNMT with parallel and comparable sentences. Although we have shown that our MNMT system with parallel sentences can improve MT by using image regions, the improvement is not significant; we plan to design novel models to address this. Our MNMT system with comparable sentences is still at the baseline level, so we plan to design MNMT models specifically for comparable sentences. 2. Improve VMT. The current VATEX validation and test sets contain many noisy sentence pairs. We plan to improve their quality via post-editing. After that, we will improve our current model towards better VMT.
|