Building a Japanese Parsed Corpus

Research Project

Project/Area Number	07558046
Research Category	Grant-in-Aid for Scientific Research (A)
Allocation Type	Single-year Grants
Section	Developmental (試験)
Research Field	Intelligent informatics
Research Institution	KYOTO UNIVERSITY
Principal Investigator	NAGAO Makoto, Kyoto University, Graduate School of Engineering, Department of Electronics and Communication, Professor (30025960)
Co-Investigator(Kenkyū-buntansha)	TSUNODA Tatsuhiko, Kyoto University, Graduate School of Engineering, Department of Electronics and Communication, Instructor (10273468); MARUYAMA Hiroshi, IBM Japan, Ltd., Tokyo Research Laboratory, Researcher; KUROHASHI Sadao, Kyoto University, Graduate School of Engineering, Department of Electronics and Communication, Instructor (50263108)
Project Period (FY)	1995 – 1996
Project Status	Completed (Fiscal Year 1996)
Budget Amount	¥6,300,000 (Direct Cost: ¥6,300,000); Fiscal Year 1996: ¥2,200,000 (Direct Cost: ¥2,200,000); Fiscal Year 1995: ¥4,100,000 (Direct Cost: ¥4,100,000)
Keywords	Natural Language Processing / Text Corpus / Morphological Analysis / Parsing
Research Abstract	The goal of the project was to construct a Japanese parsed corpus while simultaneously improving a morphological analyzer and a parser. Over the two-year project, we achieved the following results: (a) We enhanced our morphological analyzer, JUMAN, to handle a word string as a whole, and to find and register problematic fixed expressions that normal morphological analysis had analyzed incorrectly. We released the enhanced version, JUMAN3.0, in October 1996. (b) We improved the treatment of coordinate and subordinate structures in our parser, KNP. KNP was also enhanced to handle several types of phrases with exceptional sentential functions. We released the enhanced version, KNP2.0, in March 1997. (c) We built a mouse-based interface to support and speed up manual correction of the tags assigned by JUMAN and KNP. The interface also provides retrieval functions for the corpus. (d) As of March 1997, we had constructed a parsed and manually corrected corpus of about 20,000 sentences, of which about 10,000 sentences were publicly released in March 1997.

Report

(3 results)

1996 Annual Research Report; Final Research Report Summary
1995 Annual Research Report

Research Products
(13 results)

All Publications (13 results)

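[Publications] Osamu Yamaji: "連語登録による形態素解析システムJUMANの精度向上" [Improvements of Japanese Morphological Analyzer JUMAN by Handling Fixed Expressions] 2nd Annual Meeting of the Association for Natural Language Processing (ANLP). 73-76 (1996)
- Description
  From the "Summary of Research Results Report (Japanese)"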
- Related Report
  1996 Final Research Report Summary
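[Publications] Sadao Kurohashi: "京都大学におけるテキストコーパスの作成" [Building a Text Corpus at Kyoto University] Proceedings of the IPSJ Symposium "Issues in Building and Sharing Large-Scale Text Corpora". 19-26 (1996)
- Description
  From the "Summary of Research Results Report (Japanese)"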
- Related Report
  1996 Final Research Report Summary
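[Publications] Sadao Kurohashi: "京都大学テキストコーパス・プロジェクト" [Kyoto University Text Corpus Project] 3rd Annual Meeting of the Association for Natural Language Processing (ANLP). 115-118 (1997)
- Description
  From the "Summary of Research Results Report (Japanese)"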
- Related Report
  1996 Final Research Report Summary
[Publications] Sadao Kurohashi: "Building a Japanese Parsed Corpus while Improving the Parsing System" ACL/EACL'97 Workshop on Computational Environments for Grammar Development and Linguistic Engineering. (submitted). (1997)
- Description
  From the "Summary of Research Results Report (Japanese)"
- Related Report
  1996 Final Research Report Summary
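[Publications] Sadao Kurohashi: "京都大学テキストコーパス・プロジェクト" [Kyoto University Text Corpus Project] 11th Annual Conference of the Japanese Society for Artificial Intelligence (JSAI). (to be presented). (1997)
- Description
  From the "Summary of Research Results Report (Japanese)"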
- Related Report
  1996 Final Research Report Summary
[Publications] Osamu Yamaji: "Improvements of Japanese Morphological Analyzer JUMAN by Handling Fixed Expressions" Proceedings of the Second Annual Conference of ANLP. 73-76 (1996)
- Description
  From the "Summary of Research Results Report (English)"
- Related Report
  1996 Final Research Report Summary
[Publications] Sadao Kurohashi: "Building a Text Corpus at Kyoto University" Proceedings of the IPSJ Symposium on Building and Sharing Very Large Corpora. 19-26 (1996)
- Description
  From the "Summary of Research Results Report (English)"
- Related Report
  1996 Final Research Report Summary
[Publications] Sadao Kurohashi: "Kyoto University Text Corpus Project" Proceedings of the Third Annual Conference of ANLP. 115-118 (1997)
- Description
  From the "Summary of Research Results Report (English)"
- Related Report
  1996 Final Research Report Summary
[Publications] Sadao Kurohashi: "Building a Japanese Parsed Corpus while Improving the Parsing System" ACL/EACL'97 Workshop on Computational Environments for Grammar Development and Linguistic Engineering. (submitted). (1997)
- Description
  From the "Summary of Research Results Report (English)"
- Related Report
  1996 Final Research Report Summary
[Publications] Sadao Kurohashi: "Kyoto University Text Corpus Project" Proceedings of the 11th Annual Conference of JSAI. (to be published). (1997)
- Description
  From the "Summary of Research Results Report (English)"
- Related Report
  1996 Final Research Report Summary
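[Publications] Sadao Kurohashi, 坂口昌子, Makoto Nagao: "京都大学におけるテキストコーパスの作成" [Building a Text Corpus at Kyoto University] Proceedings of the IPSJ Symposium "Issues in Building and Sharing Large-Scale Text Corpora". 19-26 (1996)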
- Related Report
  1996 Annual Research Report
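[Publications] Sadao Kurohashi, Makoto Nagao: "京都大学テキストコーパス・プロジェクト" [Kyoto University Text Corpus Project] 3rd Annual Meeting of the Association for Natural Language Processing (ANLP). (1997)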
- Related Report
  1996 Annual Research Report
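[Publications] Sadao Kurohashi: "連語登録による形態素解析システムJUMANの精度向上" [Improvements of Japanese Morphological Analyzer JUMAN by Handling Fixed Expressions] 2nd Annual Meeting of the Association for Natural Language Processing (ANLP). (1995)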
- Related Report
  1995 Annual Research Report
