
2017 Fiscal Year Annual Research Report

Multiple resource adaptation for low resource neural machine translation

Research Project

Project/Area Number 17H06822
Research Institution Osaka University

Principal Investigator

Chenhui Chu (チョ シンキ), Osaka University, Institute for Datability Science, Specially Appointed Assistant Professor (full-time) (70784891)

Project Period (FY) 2017-08-25 – 2019-03-31
Keywords Machine translation / Low resource / Domain adaptation / Neural machine translation
Outline of Annual Research Achievements

In Japan, the rapid increase in foreign tourists and the hosting of the 2020 Tokyo Olympic Games are driving fast-growing demand for translation, making machine translation (MT) indispensable. In MT, translation knowledge is acquired from parallel corpora (sentence-aligned bilingual texts). However, parallel corpora between Japanese and most other languages (e.g., Japanese-Indonesian) and in most domains (e.g., the medical domain) are very scarce (only tens of thousands of parallel sentences or fewer), so translation quality is unsatisfactory. Improving MT quality in this low-resource scenario is a challenging unsolved problem.
The purpose of this research is to improve MT quality in this low-resource scenario using multiple resources, including parallel corpora of resource-rich languages (such as French-English) and domains (such as the parliamentary domain), and large-scale monolingual web corpora. In FY2017, we established model adaptation technologies using parallel corpora of resource-rich languages and domains. Specifically, we achieved the following:
1. Single language/domain adaptation. We developed novel methods and conducted a comprehensive empirical comparison of previous studies. These results were published at ACL 2017 (the top conference in natural language processing) and have been accepted for publication in the Journal of Information Processing in June.
2. Multiple language/domain adaptation. We also developed methods for domain adaptation using multilingual and multi-domain corpora, and presented our work at NLP 2018.

Current Status of Research Progress

2: On the whole, research has progressed more than originally planned.

Reason

This research is divided into three sub-topics: 1. Model adaptation using parallel corpora of resource-rich languages and domains; 2. Data adaptation using large-scale monolingual web corpora; 3. Integration of the multiple resource-adapted systems. In FY2017, we established the model adaptation technology based on parallel corpora of both resource-rich languages and domains, as scheduled.

Strategy for Future Research Activity

We will study the remaining two topics as scheduled: data adaptation using large-scale monolingual web corpora, and integration of the multiple resource-adapted systems. Our journal paper, which will be published in the Journal of Information Processing in June, already includes a comparison of previous studies on these two topics. In addition, we wrote a survey of domain adaptation for neural machine translation and submitted it to COLING 2018 (a top conference in natural language processing). We believe that these preliminary studies will help our FY2018 research proceed smoothly.

  • Research Products

    (8 results)

All Journal Article (2 results) (of which Peer Reviewed: 2 results, Open Access: 2 results) Presentation (5 results) (of which Int'l Joint Research: 1 result) Remarks (1 result)

  • [Journal Article] A Comprehensive Empirical Comparison of Domain Adaptation Methods for Neural Machine Translation (2018)

    • Author(s)
      Chenhui Chu, Raj Dabre and Sadao Kurohashi
    • Journal Title

      IPSJ Journal (情報処理学会論文誌)

      Volume: 26 Pages: N/A

    • Peer Reviewed / Open Access
  • [Journal Article] Constrained Partial Parsing Based Dependency Tree Projection for Tree-to-Tree Machine Translation (2017)

    • Author(s)
      Chenhui Chu, Yu Shen, Fabien Cromieres and Sadao Kurohashi
    • Journal Title

      Journal of Natural Language Processing (自然言語処理)

      Volume: 24(2) Pages: 267-296

    • DOI

      https://doi.org/10.11185/imt.12.172

    • Peer Reviewed / Open Access
  • [Presentation] On the Importance of Word Prediction in Neural Machine Translation (ニューラル機械翻訳における単語予測の重要性について) (2018)

    • Author(s)
      竹林 佑斗, Chenhui Chu, 荒瀬由紀, 永田 昌明
    • Organizer
      2018 Annual Conference of the Japanese Society for Artificial Intelligence (2018年度人工知能学会全国大会)
  • [Presentation] Multilingual and Multi-Domain Adaptation for Neural Machine Translation (2018)

    • Author(s)
      Chenhui Chu and Raj Dabre
    • Organizer
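      The 24th Annual Meeting of the Association for Natural Language Processing (言語処理学会 第24回年次大会)
  • [Presentation] English-Japanese Machine Translation with Pre-ordering Using a Recursive Neural Network (Recursive Neural Networkを用いた事前並び替えによる英日機械翻訳) (2018)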

    • Author(s)
      瓦祐希, Chenhui Chu, 荒瀬由紀
    • Organizer
      The 24th Annual Meeting of the Association for Natural Language Processing (言語処理学会 第24回年次大会)
  • [Presentation] An Empirical Comparison of Domain Adaptation Methods for Neural Machine Translation (2017)

    • Author(s)
      Chenhui Chu, Raj Dabre and Sadao Kurohashi
    • Organizer
      Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics
    • Int'l Joint Research
  • [Presentation] An Empirical Comparison of Simple Domain Adaptation Methods for Neural Machine Translation (2017)

    • Author(s)
      Chenhui Chu, Raj Dabre and Sadao Kurohashi
    • Organizer
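      The 23rd Annual Meeting of the Association for Natural Language Processing (言語処理学会 第23回年次大会)
  • [Remarks] Researcher's personal homepage (研究者個人ホームページ)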

    • URL

      https://researchmap.jp/chu/

Published: 2018-12-17   Modified: 2019-12-27  
