
2017 Fiscal Year Final Research Report

Language productivity: fast extraction of productive analogical clusters and their evaluation using statistical machine translation

Research Project

Project/Area Number 15K00317
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Multi-year Fund
Section General
Research Field Intelligent informatics
Research Institution Waseda University

Principal Investigator

LEPAGE YVES  Waseda University, Faculty of Science and Engineering (Graduate School of Information, Production and Systems / Center), Professor (70573608)

Research Collaborator YANG Wei  
FAM Rashel  
SUSANTI GOJALI  
Project Period (FY) 2015-04-01 – 2018-03-31
Keywords Natural language processing / Artificial intelligence / Data structures / Morphologically rich languages / Chinese and Japanese
Outline of Final Research Achievements

The project had four goals: 1/ to build tools that produce analogical clusters from monolingual data, 2/ to use such clusters to produce quasi-parallel corpora, 3/ to use these quasi-parallel corpora in addition to parallel corpora, and 4/ to thereby improve translation accuracy in statistical machine translation (SMT).
Tools were built and publicly released. Beyond what was announced in the research plan, a new data structure, the analogical grid, was introduced. Data were produced for languages ranging from morphologically poor to morphologically rich: 11 European languages (N-grams from single words to 6-grams), Chinese and Japanese (short sentences of fewer than 30 characters, for SMT experiments), and additional languages (word forms in Arabic, Georgian, Navajo, Russian, Turkish, etc.). Part of the data has been publicly released.
Various experiments showed that quasi-parallel data produced by analogy and then filtered can improve translation accuracy in Chinese-Japanese SMT.

Free Research Field

Natural language processing


Published: 2019-03-29  
