2010 Fiscal Year Final Research Report
Semi supervised word alignment model for parallel corpus
Project/Area Number |
20500149
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Intelligent informatics
|
Research Institution | National Institute of Information and Communications Technology |
Principal Investigator |
YAMAMOTO Hiroifumi National Institute of Information and Communications Technology, Faculty of Science and Engineering, Professor (00395013)
|
Co-Investigator(Renkei-kenkyūsha) |
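SUMITA Eiichiro National Institute of Information and Communications Technology (90395020)
YASUDA Keiji National Institute of Information and Communications Technology (50395018)
GOH Ghooi-Ling National Institute of Information and Communications Technology (90531616)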
|
Project Period (FY) |
2008 – 2010
|
Keywords | Natural language processing |
Research Abstract |
The purpose of this research is to improve word alignment accuracy in parallel corpora. In addition to word information, part-of-speech (POS) information and sentence structure are used. A semi-supervised approach is used for training, since it is difficult to add this additional information to every sentence in a corpus. For Japanese, English, and Chinese parallel corpora, a semi-supervised alignment method using POS tags and semantic tags for proper nouns was applied, and its effectiveness was confirmed. Next, sentence structure information was used for alignment, and its effectiveness was also confirmed.
|
Research Products
(8 results)