Semi supervised word alignment model for parallel corpus

Research Project

Project/Area Number	20500149
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Single-year Grants
Section	General
Research Field	Intelligent informatics
Research Institution	National Institute of Information and Communications Technology
Principal Investigator	YAMAMOTO Hirofumi National Institute of Information and Communications Technology, Faculty of Science and Engineering, Professor (00395013)
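Co-Investigator (Renkei-kenkyūsha)	SUMITA Eiichiro National Institute of Information and Communications Technology (90395020); YASUDA Keiji National Institute of Information and Communications Technology (50395018); GOH Chooi-Ling National Institute of Information and Communications Technology (90531616)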
Project Period (FY)	2008 – 2010
Project Status	Completed (Fiscal Year 2010)
Budget Amount	¥3,900,000 (Direct Cost: ¥3,000,000, Indirect Cost: ¥900,000); Fiscal Year 2010: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000); Fiscal Year 2009: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000); Fiscal Year 2008: ¥1,170,000 (Direct Cost: ¥900,000, Indirect Cost: ¥270,000)
Keywords	Natural language processing / Alignment / Multilingualization / Semi-supervised learning / Proper nouns / Probabilistic constraints
Research Abstract	The purpose of this research is to improve word alignment accuracy in parallel corpora. In addition to word information, part-of-speech (POS) information and sentence structure are used. A semi-supervised approach is used for training, since it is difficult to add this extra information to every sentence in a corpus. For Japanese, English, and Chinese parallel corpora, a semi-supervised alignment method using POS tags and semantic tags for proper nouns was applied, and its effectiveness was confirmed. Next, sentence structure information was used for alignment, and its effectiveness was also confirmed.

Report

(4 results)

2010 Annual Research Report / Final Research Report (PDF)
2009 Annual Research Report
2008 Annual Research Report

Research Products
(15 results)

Journal Article (5 results, of which Peer Reviewed: 5 results) / Presentation (10 results)

[Journal Article] A Reordering Model Using a Source-Side Parse-Tree for Statistical Machine Translation (2009)
- Author(s)
  HASHIMOTO Kei, YAMAMOTO Hirofumi, OKUMA Hideo, SUMITA Eiichiro, TOKUDA Keiichi
- Journal Title
  IEICE transactions on information and systems 92(12)
  Pages: 2386-2393
- NAID
  10026812507
- Related Report
  2010 Final Research Report
- Peer Reviewed
[Journal Article] Imposing Constraints from the Source Tree on ITG Constraints for SMT (2009)
- Author(s)
  YAMAMOTO Hirofumi, OKUMA Hideo, SUMITA Eiichiro
- Journal Title
  IEICE transactions on information and systems 92(9)
  Pages: 1762-1770
- NAID
  10026811036
- Related Report
  2010 Final Research Report
- Peer Reviewed
[Journal Article] A Feature-rich Supervised Word Alignment Model for Phrase-based Statistical Machine Translation (2009)
- Author(s)
  Chooi-Ling Goh, Eiichiro Sumita
- Journal Title
  International Journal of Asian Language Processing Vol.19, No.3
  Pages: 109-125
- Related Report
  2010 Final Research Report
- Peer Reviewed
[Journal Article] A Reordering Model Using a Source-Side Parse-Tree for Statistical Machine Translation (2009)
- Author(s)
  HASHIMOTO Kei
- Journal Title
  IEICE TRANSACTIONS on Information and Systems Vol.E92-D, No.12
  Pages: 2386-2393
- Related Report
  2009 Annual Research Report
- Peer Reviewed
[Journal Article] A Feature-rich Supervised Word Alignment Model for Phrase-based Statistical Machine Translation (2009)
- Author(s)
  GOH Chooi-Ling
- Journal Title
  International Journal of Asian Language Processing Vol.19, No.3
  Pages: 109-125
- Related Report
  2009 Annual Research Report
- Peer Reviewed
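[Presentation] Introducing Head-Final Constraints into English-to-Japanese SMT (英日SMTへのHead-Final制約の導入) (2011)
- Author(s)
  NISHIMURA Takuya (西村拓哉), YAMAMOTO Hirofumi, OKUMA Hideo, MURAKAMI Jin'ichi (村上仁一)
- Organizer
  The 17th Annual Meeting of the Association for Natural Language Processing
- Place of Presentation
  Toyohashi University of Technology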
- Year and Date
  2011-03-08
- Related Report
  2010 Final Research Report
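[Presentation] Introducing Head-Final Constraints into English-to-Japanese SMT (英日SMTへのHead-Final制約の導入) (2011)
- Author(s)
  NISHIMURA Takuya (西村拓哉)
- Organizer
  The 17th Annual Meeting of the Association for Natural Language Processing
- Place of Presentation
  Toyohashi University of Technology (Aichi)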
- Year and Date
  2011-03-08
- Related Report
  2010 Annual Research Report
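[Presentation] Translation by Grouping Unknown Words in Statistical Machine Translation (統計的機械翻訳における未登録語のグループ化による翻訳) (2010)
- Author(s)
  YOSHIZAKI Daisuke (吉崎大輔), YAMAMOTO Hirofumi, OKUMA Hideo, SAGISAKA Yoshinori (匂坂芳典)
- Organizer
  The 16th Annual Meeting of the Association for Natural Language Processing
- Place of Presentation
  The University of Tokyo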
- Year and Date
  2010-03-10
- Related Report
  2010 Final Research Report
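[Presentation] Translation by Grouping Unknown Words in Statistical Machine Translation (統計的機械翻訳における未登録語のグループ化による翻訳) (2010)
- Author(s)
  YOSHIZAKI Daisuke (吉崎大輔)
- Organizer
  The 16th Annual Meeting of the Association for Natural Language Processing
- Place of Presentation
  The University of Tokyo (Tokyo)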
- Year and Date
  2010-03-10
- Related Report
  2009 Annual Research Report
[Presentation] Supervised Word Alignment for Phrase-based Statistical Machine Translation (2009)
- Author(s)
  GOH Chooi-Ling
- Organizer
  The 15th Annual Meeting of the Association for Natural Language Processing
- Place of Presentation
  Tottori
- Year and Date
  2009-03-05
- Related Report
  2008 Annual Research Report
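[Presentation] Supervised Word Alignment for Phrase-based Statistical Machine Translation (2009)
- Author(s)
  GOH Chooi-Ling, SUMITA Eiichiro
- Organizer
  Proceedings of the 15th Annual Meeting of the Association for Natural Language Processing (pp. 873-876)
- Place of Presentation
  Tottori University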
- Related Report
  2010 Final Research Report
[Presentation] Guidelines for Chinese-English Word Alignment (2008)
- Author(s)
  Hongmei ZHAO
- Organizer
  The 4th China Workshop on Machine Translation, CWMT'2008
- Place of Presentation
  Beijing (China)
- Year and Date
  2008-11-27
- Related Report
  2008 Annual Research Report
[Presentation] Imposing Constraints from the Source Tree on ITG Constraints for SMT (2008)
- Author(s)
  YAMAMOTO Hirofumi
- Organizer
  ACL-08: HLT SSST-2 (The Second Workshop on Syntax and Structure in Statistical Translation)
- Place of Presentation
  Ohio (USA)
- Year and Date
  2008-06-20
- Related Report
  2008 Annual Research Report
[Presentation] Guidelines for Chinese-English Word Alignment (2008)
- Author(s)
  Hongmei Zhao, Qun Liu, Ruiqiang Zhang, Yajuan Lv, SUMITA Eiichiro, GOH Chooi-Ling
- Organizer
  Proceedings of CWMT'2008 (pp. 153-163)
- Place of Presentation
  Beijing (China)
- Related Report
  2010 Final Research Report
[Presentation] Imposing Constraints from the Source Tree on ITG Constraints for SMT (2008)
- Author(s)
  YAMAMOTO Hirofumi, OKUMA Hideo, SUMITA Eiichiro
- Organizer
  ACL-08: HLT Second Workshop on Syntax and Structure in Statistical Translation (SSST-2)
- Place of Presentation
  Ohio (USA)
- Related Report
  2010 Final Research Report
