2002 Fiscal Year Final Research Report Summary

Study on Integration of Statistical Information and Linguistic Constraint Information

Research Project

Project/Area Number	12480089
Research Category	Grant-in-Aid for Scientific Research (B)
Allocation Type	Single-year Grants
Section	General
Research Field	Intelligent informatics
Research Institution	NARA INSTITUTE OF SCIENCE AND TECHNOLOGY
Principal Investigator	MATSUMOTO Yuji Nara Institute of Science and Technology, Graduate School of Information Science, professor (10211575)
Co-Investigator(Kenkyū-buntansha)	OHTANI Akira Osaka Gakuin University, Faculty of Informatics, lecturer (50283817); MIYAMOTO Edson Nara Institute of Science and Technology, Graduate School of Information Science, assistant professor (60335479); INUI Kentaro Nara Institute of Science and Technology, Graduate School of Information Science, associate professor (60272689); MIYATA Takashi Nara Institute of Science and Technology, Graduate School of Information Science, assistant professor (currently: National Institute of Advanced Industrial Science and Technology, researcher) (00283929)
Project Period (FY)	2000 – 2002
Keywords	Head-driven Phrase Structure Grammar / Constraint-based Grammar Formalism / Dependency Analysis / Morphological Analysis / Statistical Natural Language Processing / Machine Learning / Support Vector Machines / Integration of Statistical and Constraint Information
Research Abstract	With the growth of machine-readable linguistic data, statistical natural language processing has become an active research area. However, most statistical approaches target surface-level language processing and are not suited to detailed semantic analysis. Constraint-based grammar formalisms such as Head-driven Phrase Structure Grammar (HPSG), on the other hand, describe linguistic phenomena as lexical knowledge, with most linguistic constraints stated in the lexicon. Although such formalisms specify complex linguistic information in a highly modular way, they have the drawback that any input violating a linguistic constraint cannot be parsed at all. This research aimed to compensate for the drawbacks of both approaches by integrating the two mechanisms. We first implemented a robust, high-quality word-based dependency analyzer for sentences using statistical information. The constraint-based grammar then takes the output of the statistical dependency analysis and searches for possible interpretations consistent with that dependency structure. To make the processing robust, we implemented a constraint relaxation mechanism. We also implemented type coercion and co-composition as proposed in the Generative Lexicon, together with a user interface for browsing intermediate processing information. For dependency analysis, we used Support Vector Machines to cope with a large-scale feature space and devised a deterministic bottom-up parsing algorithm for Japanese and English. We implemented part of a Japanese grammar based on HPSG. The statistical and constraint-based grammars and parsers run within the user interface we developed, which is intended for grammar developers and users of the natural language processing system.
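
The deterministic bottom-up dependency analysis mentioned in the abstract can be sketched roughly as follows. This is a simplified illustration in the spirit of cascaded chunking, not the project's implementation: it assumes a strictly head-final language such as Japanese, attaches every chunk judged to modify its right neighbour in each pass, and replaces the Support Vector Machine with a hypothetical hand-written rule. The names `cascaded_chunking_parse`, `depends_on_next`, and `toy_rule`, as well as the romanized example sentence, are assumptions made for illustration only.

```python
from typing import Callable, List, Sequence


def cascaded_chunking_parse(
    chunks: Sequence[str],
    depends_on_next: Callable[[str, str], bool],
) -> List[int]:
    """Return heads[i], the index of the chunk that chunk i modifies (-1 for the root).

    Assumption: every chunk except the last modifies exactly one chunk to its right.
    """
    heads = [-1] * len(chunks)
    remaining = list(range(len(chunks)))  # chunks not yet attached to a head
    while len(remaining) > 1:
        # One deterministic decision per adjacent pair of remaining chunks.
        attach = [
            depends_on_next(chunks[left], chunks[right])
            for left, right in zip(remaining, remaining[1:])
        ]
        # The second-to-last remaining chunk can only modify the last one,
        # which also guarantees progress in every pass.
        attach[-1] = True
        survivors = []
        for pos, idx in enumerate(remaining[:-1]):
            if attach[pos]:
                heads[idx] = remaining[pos + 1]  # attach to the right neighbour
            else:
                survivors.append(idx)  # decide again in a later pass
        survivors.append(remaining[-1])
        remaining = survivors
    return heads


# Hypothetical stand-in for a trained classifier: a lookup of plausible pairs.
PLAUSIBLE = {("akai", "hon-wo"), ("hon-wo", "yonda"), ("kare-wa", "yonda")}


def toy_rule(left: str, right: str) -> bool:
    return (left, right) in PLAUSIBLE


sentence = ["kare-wa", "akai", "hon-wo", "yonda"]  # "he read a red book"
print(cascaded_chunking_parse(sentence, toy_rule))  # prints [3, 2, 3, -1]
```

In a statistical setting, `depends_on_next` would be a trained binary classifier over features of the two chunks; the loop itself stays deterministic because each decision is made once per pass and never revised.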

Research Products
(18 results)

All Publications (18 results)

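[Publications] Akira Ohtani, Takashi Miyata, Yuji Matsumoto: "On HPSG-Based Japanese Grammar: Refinement and Extension for Implementation" (in Japanese) Journal of Natural Language Processing. 7(5). 19-49 (2000)
- Description
  From the Research Report Summary (Japanese)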
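[Publications] Masayuki Asahara, Yuji Matsumoto: "Extended Statistical Model for Morphological Analysis" (in Japanese) Journal of Information Processing Society of Japan. 43(3). 685-695 (2002)
- Description
  From the Research Report Summary (Japanese)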
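[Publications] Yuji Matsumoto: "Usage of System Ensemble Methods in Natural Language Processing" (in Japanese) Journal of the Institute of Electronics, Information and Communication Engineers. J85-D-II. 709-716 (2002)
- Description
  From the Research Report Summary (Japanese)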
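[Publications] Taku Kudo, Yuji Matsumoto: "Japanese Dependency Analysis Using Cascaded Chunking" (in Japanese) Journal of Information Processing Society of Japan. 43(6). 1834-1842 (2002)
- Description
  From the Research Report Summary (Japanese)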
[Publications] Tetsuji Nakagawa, Taku Kudo, Yuji Matsumoto: "Revision Learning and its Application to Part-of-speech Tagging" Proc. 40th Annual Meeting of Association for Computational Linguistics. 40. 497-504 (2002)
- Description
  From the Research Report Summary (Japanese)
[Publications] Taku Kudo, Yuji Matsumoto: "Japanese Dependency Analysis using Cascaded Chunking" Proc. 6th Conference on Natural Language Learning. 6. 63-69 (2002)
- Description
  From the Research Report Summary (Japanese)
[Publications] Yuji Matsumoto (contributing author): "Handbook of Computational Linguistics (Chap. 21: Lexical Knowledge Acquisition)" Oxford University Press. 784 (2003)
- Description
  From the Research Report Summary (Japanese)
[Publications] Tatsuo Yamashita, Yuji Matsumoto: "Language Independent Morphological Analysis" Proceedings of 6th Applied Natural Language Processing Conference. 232-238 (2000)
- Description
  From the Research Report Summary (English)
[Publications] Masayuki Asahara, Yuji Matsumoto: "Extended Models and Tools for High-performance Part-of-speech Tagger" Proceedings of the 18th International Conference on Computational Linguistics. 21-27 (2000)
- Description
  From the Research Report Summary (English)
[Publications] Taku Kudo and Yuji Matsumoto: "Japanese Dependency Structure Analysis Based on Support Vector Machines" Proceedings of the 2000 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora. 18-25 (2000)
- Description
  From the Research Report Summary (English)
[Publications] Akira Ohtani, Takashi Miyata, Yuji Matsumoto: "On HPSG-Based Japanese Grammar: Refinement and Extension for Implementation" Journal of Natural Language Processing. Vol.7, No.5. 19-49 (2000)
- Description
  From the Research Report Summary (English)
[Publications] Takashi Miyata, Akira Ohtani and Yuji Matsumoto: "An HPSG Account of the Hierarchical Clause Formation in Japanese: HPSG-Based Japanese Grammar for Practical Parsing" Proceedings of the 15th Pacific Asia Conference on Language, Information and Computation. 305-316 (2001)
- Description
  From the Research Report Summary (English)
[Publications] Tetsuji Nakagawa, Taku Kudoh and Yuji Matsumoto: "Unknown Word Guessing and Part-of-Speech Tagging Using Support Vector Machines" Proceedings of the Sixth Natural Language Processing Pacific Rim Symposium. 325-331 (2001)
- Description
  From the Research Report Summary (English)
[Publications] Masayuki Asahara, Yuji Matsumoto: "Extended Statistical Model for Morphological Analysis" Journal of Information Processing Society of Japan. Vol.43, No.3. 685-695 (2002)
- Description
  From the Research Report Summary (English)
[Publications] Yuji Matsumoto: "Usage of System Ensemble Methods in Natural Language Processing" Journal of the Institute of Electronics, Information and Communication Engineers. Vol.J85-D-II, No.5. 709-716 (2002)
- Description
  From the Research Report Summary (English)
[Publications] Taku Kudo, Yuji Matsumoto: "Japanese Dependency Analysis Using Cascaded Chunking" Journal of Information Processing Society of Japan. Vol.43, No.6. 1834-1842 (2002)
- Description
  From the Research Report Summary (English)
[Publications] Tetsuji Nakagawa, Taku Kudo and Yuji Matsumoto: "Revision Learning and its Application to Part-of-Speech Tagging" Proceedings of 40th Annual Meeting of Association for Computational Linguistics. 497-504 (2002)
- Description
  From the Research Report Summary (English)
[Publications] Taku Kudo and Yuji Matsumoto: "Japanese Dependency Analysis using Cascaded Chunking" Proceedings of Sixth Conference on Natural Language Learning. 63-69 (2002)
- Description
  From the Research Report Summary (English)
