Study on Integration of Statistical Information and Linguistic Constraint Information

Research Project

Project/Area Number	12480089
Research Category	Grant-in-Aid for Scientific Research (B)
Allocation Type	Single-year Grants
Section	General
Research Field	Intelligent informatics
Research Institution	NARA INSTITUTE OF SCIENCE AND TECHNOLOGY
Principal Investigator	MATSUMOTO Yuji Nara Institute of Science and Technology, Grad School of Information Science, professor (10211575)
Co-Investigator(Kenkyū-buntansha)	OHTANI Akira Osaka Gakuin University, Faculty of Informatics, lecturer (50283817); MIYAMOTO Edson Nara Institute of Science and Technology, Grad School of Information Science, assistant professor (60335479); INUI Kentaro Nara Institute of Science and Technology, Grad School of Information Science, associate professor (60272689); MIYATA Takashi Nara Institute of Science and Technology, Grad School of Information Science, assistant professor (currently: National Institute of Advanced Industrial Science and Technology, researcher) (00283929)
Project Period (FY)	2000 – 2002
Project Status	Completed (Fiscal Year 2002)
Budget Amount	¥9,700,000 (Direct Cost: ¥9,700,000) Fiscal Year 2002: ¥3,100,000 (Direct Cost: ¥3,100,000) Fiscal Year 2001: ¥3,200,000 (Direct Cost: ¥3,200,000) Fiscal Year 2000: ¥3,400,000 (Direct Cost: ¥3,400,000)
Keywords	Head-driven Phrase Structure Grammar / Constraint-based Grammar Formalism / Dependency Analysis / Morphological Analysis / Statistical Natural Language Processing / Machine Learning / Support Vector Machines / Integration of Statistical and Constraint Information / Statistical Dependency Analysis / Constraint-based Language Processing / Generative Lexicon / Unification Grammar / Integrated Processing / Natural Language Processing / Syntactic Parsing
Research Abstract	As machine-readable linguistic data has grown, statistical natural language processing has been actively researched. However, most statistical natural language processing targets surface-level language processing and is not suited to detailed semantic analysis. Constraint-based grammar formalisms such as Head-driven Phrase Structure Grammar, on the other hand, describe linguistic phenomena as lexical knowledge, and most linguistic constraints are stated in the lexicon. While such formalisms specify complicated linguistic information in a very modular way, they have the drawback that any input that violates a linguistic constraint cannot be parsed at all. This research aimed to compensate for the drawbacks of both approaches by integrating the two mechanisms. We first implemented a robust, high-quality word-based dependency analysis of sentences using statistical information. The constraint-based grammar formalism then receives the output of the statistical dependency analysis and finds possible interpretations consistent with the dependency structure. To achieve robust language processing, we implemented a constraint-relaxation mechanism. We also implemented the ideas of type coercion and co-composition proposed in the Generative Lexicon, as well as a user interface for browsing intermediate processing information. For dependency analysis, we used Support Vector Machines to cope with a large-scale feature space and devised a deterministic bottom-up parsing algorithm for Japanese and English. We implemented part of a Japanese grammar based on Head-driven Phrase Structure Grammar. The statistical and constraint-based grammars and parsers run in the user interface we developed, which is intended for grammar developers and users of the natural language processing system.
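
The deterministic bottom-up dependency analysis mentioned in the abstract can be illustrated with a simplified sketch. This is not the project's implementation: the function name `parse_bottom_up`, the toy rule `toy_modifies_next`, and the romanized example chunks are all hypothetical, and the toy rule merely stands in for a trained Support Vector Machine classifier. The sketch assumes a head-final language such as Japanese, where every chunk modifies some chunk to its right and the final chunk is the root.

```python
from typing import Callable, List, Sequence


def parse_bottom_up(
    chunks: Sequence[str],
    modifies_next: Callable[[str, str], bool],
) -> List[int]:
    """Return heads, where heads[i] is the index of the chunk that chunk i
    modifies and -1 marks the root (the final chunk)."""
    heads = [-1] * len(chunks)
    alive = list(range(len(chunks)))  # chunks whose head is still undecided
    while len(alive) > 1:
        survivors = []
        for pos, idx in enumerate(alive[:-1]):
            right = alive[pos + 1]
            # The chunk just before the last one has no other possible head,
            # so it is attached unconditionally; this guarantees progress.
            forced = pos == len(alive) - 2
            if forced or modifies_next(chunks[idx], chunks[right]):
                heads[idx] = right      # attach and drop from later rounds
            else:
                survivors.append(idx)   # retry once its right neighbour changes
        survivors.append(alive[-1])
        alive = survivors
    return heads


def toy_modifies_next(left: str, right: str) -> bool:
    """Hypothetical stand-in for an SVM decision: only the adnominal
    word 'akai' (red) is judged to modify its right neighbour."""
    return left == "akai"


# "kare-wa akai hon-wo katta" (he bought a red book), one chunk per token.
heads = parse_bottom_up(["kare-wa", "akai", "hon-wo", "katta"], toy_modifies_next)
print(heads)  # [3, 2, 3, -1]
```

Each round makes one left-to-right pass and commits to every attachment immediately, with no backtracking, which is what makes the procedure deterministic. In a real system the binary decision would come from a classifier over rich features of the two chunks and their context.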

Report

(4 results)

2002 Annual Research Report / Final Research Report Summary
2001 Annual Research Report
2000 Annual Research Report

Research Products
(36 results)

All Publications (36 results)

[Publications] Akira Ohtani, Takashi Miyata, Yuji Matsumoto: "On HPSG-Based Japanese Grammar: Refinement for Implementation" Journal of Natural Language Processing. 7(5). 19-49 (2000)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2002 Final Research Report Summary
[Publications] Masayuki Asahara, Yuji Matsumoto: "Extended Statistical Model for Morphological Analysis" Journal of Information Processing Society of Japan. 43(3). 685-695 (2002)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2002 Final Research Report Summary
[Publications] Yuji Matsumoto: "Usage of System Ensemble Methods in Natural Language Processing" Journal of the Institute of Electronics, Information and Communication Engineers. J85-DII. 709-716 (2002)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2002 Final Research Report Summary
[Publications] Taku Kudo, Yuji Matsumoto: "Japanese Dependency Analysis Using Cascaded Chunking" Journal of Information Processing Society of Japan. 43(6). 1834-1842 (2002)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2002 Final Research Report Summary
[Publications] Tetsuji Nakagawa, Taku Kudo, Yuji Matsumoto: "Revision Learning and its Application to Part-of-speech Tagging" Proc. 40th Annual Meeting of Association for Computational Linguistics. 40. 497-504 (2002)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2002 Final Research Report Summary
[Publications] Taku Kudo, Yuji Matsumoto: "Japanese Dependency Analysis using Cascaded Chunking" Proc. 6th Conference on Natural Language Learning. 6. 63-69 (2002)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2002 Final Research Report Summary
[Publications] Yuji Matsumoto (contributing author): "Handbook of Computational Linguistics (Chap. 21: Lexical Knowledge Acquisition)" Oxford University Press. 784 (2003)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2002 Final Research Report Summary
[Publications] Tatsuo Yamashita, Yuji Matsumoto: "Language Independent Morphological Analysis" Proceedings of 6th Applied Natural Language Processing Conference. 232-238 (2000)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2002 Final Research Report Summary
[Publications] Masayuki Asahara, Yuji Matsumoto: "Extended Models and Tools for High-performance Part-of-speech Tagger" Proceedings of the 18th International Conference on Computational Linguistics. 21-27 (2000)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2002 Final Research Report Summary
[Publications] Taku Kudo and Yuji Matsumoto: "Japanese Dependency Structure Analysis Based on Support Vector Machines" Proceedings of the 2000 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora. 18-25 (2000)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2002 Final Research Report Summary
[Publications] Akira Ohtani, Takashi Miyata, Yuji Matsumoto: "On HPSG-Based Japanese Grammar: Refinement and Extension for Implementation" Journal of Natural Language Processing. Vol.7, No.5. 19-49 (2000)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2002 Final Research Report Summary
[Publications] Takashi Miyata, Akira Ohtani and Yuji Matsumoto: "An HPSG Account of the Hierarchical Clause Formation in Japanese: HPSG-Based Japanese Grammar for Practical Parsing" Proceedings of the 15th Pacific Asia Conference on Language, Information and Computation. 305-316 (2001)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2002 Final Research Report Summary
[Publications] Tetsuji Nakagawa, Taku Kudoh and Yuji Matsumoto: "Unknown Word Guessing and Part-of-Speech Tagging Using Support Vector Machines" Proceedings of the Sixth Natural Language Processing Pacific Rim Symposium. 325-331 (2001)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2002 Final Research Report Summary
[Publications] Masayuki Asahara, Yuji Matsumoto: "Extended Statistical Model for Morphological Analysis" Journal of Information Processing Society of Japan. Vol.43, No.3. 685-695 (2002)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2002 Final Research Report Summary
[Publications] Yuji Matsumoto: "Usage of System Ensemble Methods in Natural Language Processing" Journal of the Institute of Electronics, Information and Communication Engineers. Vol.J85-D-II, No.5. 709-716 (2002)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2002 Final Research Report Summary
[Publications] Taku Kudo, Yuji Matsumoto: "Japanese Dependency Analysis Using Cascaded Chunking" Journal of Information Processing Society of Japan. Vol.43, No.6. 1834-1842 (2002)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2002 Final Research Report Summary
[Publications] Tetsuji Nakagawa, Taku Kudo and Yuji Matsumoto: "Revision Learning and its Application to Part-of-Speech Tagging" Proceedings of 40th Annual Meeting of Association for Computational Linguistics. 497-504 (2002)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2002 Final Research Report Summary
[Publications] Taku Kudo and Yuji Matsumoto: "Japanese Dependency Analysis using Cascaded Chunking" Proceedings of Sixth Conference on Natural Language Learning. 63-69 (2002)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2002 Final Research Report Summary
[Publications] Taku Kudo, Yuji Matsumoto: "Japanese Dependency Analysis Using Cascaded Chunking" Journal of Information Processing Society of Japan. 43(6). 1834-1842 (2002)
- Related Report
  2002 Annual Research Report
[Publications] Taku Kudo, Yuji Matsumoto: "Japanese Dependency Analysis using Cascaded Chunking" Proc. 6th Conference on Natural Language Learning. CoNLL02. 63-69 (2002)
- Related Report
  2002 Annual Research Report
[Publications] Edson Miyamoto: "Case markers as clause boundary inducers in Japanese" Journal of Psycholinguistic Research. 31(4). 307-346 (2002)
- Related Report
  2002 Annual Research Report
[Publications] Akira Ohtani, Yuji Matsumoto: "Formalization of Benefactive Constructions in NAIST JPSG" Proceedings of the 19th Annual Meeting of the Japanese Cognitive Science Society. 19. 86-87 (2002)
- Related Report
  2002 Annual Research Report
[Publications] Ryu Iida, Kentaro Inui, Hiroya Takamura, Yuji Matsumoto: "Incorporating Contextual Clues in Trainable Models for Coreference Resolution" EACL2003 Workshop on The Computational Treatment of Anaphora. (to appear). (2003)
- Related Report
  2002 Annual Research Report
[Publications] Yoshihiro Morimoto, Yuji Matsumoto: "Extending HPSG Unification and Implementing a System for Tracing Its Execution" Proceedings of the 9th Annual Meeting of the Association for Natural Language Processing. 9. 429-432 (2003)
- Related Report
  2002 Annual Research Report
[Publications] Takashi Miyata, Akira Ohtani, Yuji Matsumoto: "An HPSG Account of the Hierarchical Clause Formation in Japanese: HPSG-Based Japanese Grammar for Practical Parsing" Proceedings of the 15th Pacific Asia Conference on Language, Information and Computation. 15. 305-316 (2001)
- Related Report
  2001 Annual Research Report
[Publications] Taku Kudo, Yuji Matsumoto: "Chunking with Support Vector Machines" Proceedings of the Second Meeting of North American Chapter of Association for Computational Linguistics. 2. 192-199 (2001)
- Related Report
  2001 Annual Research Report
[Publications] Tetsuji Nakagawa, Taku Kudoh, Yuji Matsumoto: "Unknown Word Guessing and Part-of-Speech Tagging Using Support Vector Machines" Proceedings of the Sixth Natural Language Processing Pacific Rim Symposium. 6. 325-331 (2001)
- Related Report
  2001 Annual Research Report
[Publications] Yuji Matsumoto, Yasuharu Den: "Morphological Analysis of Spoken Language" IPSJ SIG Technical Report. NL-143. 49-54 (2001)
- Related Report
  2001 Annual Research Report
[Publications] Yuji Matsumoto: "On the Implementation and Extension of HPSG" Symposium of the 19th Conference of the English Linguistic Society of Japan, Conference Handbook. 19. 198-203 (2001)
- Related Report
  2001 Annual Research Report
[Publications] Takashi Miyata, Akira Ohtani: "A Lexicon Description Tool for Feature-Based Grammars" IPSJ SIG Technical Report. NL-146. 67-73 (2001)
- Related Report
  2001 Annual Research Report
[Publications] Akira Ohtani, Takashi Miyata, Yuji Matsumoto: "A Study of Japanese Case Particles Based on HPSG" Proceedings of the 17th Annual Meeting of the Japanese Cognitive Science Society. 17. 136-137 (2000)
- Related Report
  2000 Annual Research Report
[Publications] Taku Kudoh and Yuji Matsumoto: "Japanese Dependency Structure Analysis Based on Support Vector Machines" Proceedings of the 2000 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora. 5. 18-25 (2000)
- Related Report
  2000 Annual Research Report
[Publications] Akira Ohtani, Takashi Miyata, Yuji Matsumoto: "On HPSG-Based Japanese Grammar: Refinement and Extension for Implementation" Journal of Natural Language Processing. 7(5). 19-49 (2000)
- Related Report
  2000 Annual Research Report
[Publications] Akira Ohtani, Takashi Miyata, Yuji Matsumoto: "Semantic Hierarchy in Predicate Adjacency and Predicate Adjunction" IPSJ SIG Technical Report on Natural Language Processing. 2000-NL-140. 103-110 (2000)
- Related Report
  2000 Annual Research Report
[Publications] Takashi Miyata, Kaoru Yamamoto, Yuji Matsumoto: "English Dependency Analysis Using Support Vector Machines" IPSJ SIG Technical Report on Natural Language Processing. 2000-NL-140. 135-142 (2000)
- Related Report
  2000 Annual Research Report
[Publications] Takashi Miyata, Akira Ohtani, Yuji Matsumoto: "An HPSG Account of the Hierarchical Clause Formation in Japanese: HPSG-Based Japanese Grammar for Practical Parsing" Proceedings of the 15th Pacific Asia Conference on Language, Information and Computation. 15. 305-316 (2001)
- Related Report
  2000 Annual Research Report
