Project/Area Number |
12480089
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Intelligent informatics
|
Research Institution | NARA INSTITUTE OF SCIENCE AND TECHNOLOGY |
Principal Investigator |
MATSUMOTO Yuji Nara Institute of Science and Technology, Graduate School of Information Science, professor (10211575)
|
Co-Investigator (Kenkyū-buntansha) |
OHTANI Akira Osaka Gakuin University, Faculty of Informatics, lecturer (50283817)
MIYAMOTO Edson Nara Institute of Science and Technology, Graduate School of Information Science, assistant professor (60335479)
INUI Kentaro Nara Institute of Science and Technology, Graduate School of Information Science, associate professor (60272689)
MIYATA Takashi Nara Institute of Science and Technology, Graduate School of Information Science, assistant professor (currently: researcher, National Institute of Advanced Industrial Science and Technology) (00283929)
|
Project Period (FY) |
2000 – 2002
|
Project Status |
Completed (Fiscal Year 2002)
|
Budget Amount |
¥9,700,000 (Direct Cost: ¥9,700,000)
Fiscal Year 2002: ¥3,100,000 (Direct Cost: ¥3,100,000)
Fiscal Year 2001: ¥3,200,000 (Direct Cost: ¥3,200,000)
Fiscal Year 2000: ¥3,400,000 (Direct Cost: ¥3,400,000)
|
Keywords | Head-driven Phrase Structure Grammar / Constraint-based Grammar Formalism / Dependency Analysis / Morphological Analysis / Statistical Natural Language Processing / Machine Learning / Support Vector Machines / Integration of Statistical and Constraint Information / Statistical Dependency Analysis / Constraint-based Language Processing / Generative Lexicon / Unification Grammar / Integrated Processing / Natural Language Processing / Syntactic Parsing |
Research Abstract |
With the growth of machine-readable linguistic data, statistical natural language processing has been actively researched. However, most statistical natural language processing targets surface-level language processing and is not suited to detailed semantic analysis. Constraint-based grammar formalisms such as Head-driven Phrase Structure Grammar, on the other hand, attempt to describe linguistic phenomena as lexical knowledge, and most of the linguistic constraints are stated in the lexicon. While such formalisms specify complicated linguistic information in a very modular way, they have the drawback that any input violating a linguistic constraint cannot be parsed at all. This research aimed to compensate for the drawbacks of both approaches by integrating the two mechanisms. We first implemented a robust, high-quality word-based dependency analyzer for sentences using statistical information. The constraint-based grammar then receives the output of the statistical dependency analysis and finds possible interpretations consistent with the dependency structure. To achieve robust language processing, we implemented a constraint relaxation mechanism. We also implemented the ideas of type coercion and co-composition proposed in the Generative Lexicon, as well as a user interface for browsing intermediate processing information. For dependency analysis, we used Support Vector Machines to cope with a large-scale feature space and devised a deterministic bottom-up parsing algorithm for Japanese and English. We implemented part of a Japanese grammar based on Head-driven Phrase Structure Grammar. These statistical and constraint-based grammars and parsers can be run in the user interface we developed, which is intended for grammar developers and users of the natural language processing system.
|