2004 Fiscal Year Final Research Report Summary
Robust Incremental Parsing based on Finite-State Approximation of Context Free Grammar
Project/Area Number | 15300044 |
Research Category | Grant-in-Aid for Scientific Research (B) |
Allocation Type | Single-year Grants |
Section | General |
Research Field | Intelligent informatics |
Research Institution | Nagoya University |
Principal Investigator |
OGAWA Yasuhiro Nagoya University, Graduate School of Information Science, Research Assistant (70332707)
|
Co-Investigator(Kenkyū-buntansha) |
INAGAKI Yasuyoshi Aichi Prefectural University, Faculty of Information Science and Technology, Professor (10023079)
TOYAMA Katsuhiko Nagoya University, Graduate School of Information Science, Associate Professor (70217561)
MATSUBARA Shigeki Nagoya University, Information Technology Center, Associate Professor (20303589)
MUHTAR Mahsut Nagoya University, Graduate School of International Development, Research Assistant (20283517)
OHKUBO Hirotaka Aichi Prefectural University, Faculty of Information Science and Technology, Research Assistant (40295580)
|
Project Period (FY) | 2003 – 2004 |
Keywords | natural language processing / parsing / finite state automaton / algorithm / simultaneous interpretation / spoken language / corpus / context free grammar |
Research Abstract |
This research aimed to develop a fast, incremental parsing technology that can keep pace with speech input, as a basis for multilingual translation technology with a simultaneous interpretation function. The approach taken was to develop a finite-state approximation technique based on statistical methods, so that information such as the usage frequency of grammar rules, obtained from language corpora annotated with syntax trees, is reflected appropriately in the conversion. Specifically, the following items were pursued.
* Preparation of English, Japanese, and Thai corpora. The simultaneous interpretation conversation corpus collected at the Center for Integrated Acoustic Information Research, Nagoya University, was used. For Thai, translation data were created in this project from 24 hours of the Japanese data in that corpus and the corresponding English translations. Syntax tree data were added to the corpus of each language in the form of a phrase structure grammar.
* Acquisition of statistical information from large-scale corpora with syntax trees. Techniques for acquiring various kinds of statistical information from syntactically annotated corpora were examined. The positions in the syntax trees at which each grammar rule appears were located and described together with their context information, and the results were analyzed statistically.
* Development of a finite automaton approximation technique. A technique for converting a context-free grammar into a finite automaton was studied. In the conversion, the context-free grammar is expressed as a recursive transition network, which is then expanded top-down to build the automaton. As an expansion method driven by probability calculation, an algorithm was developed that gives priority to expanding arcs with high usage frequency.
* Design and implementation of an incremental parsing system. A system covering grammar acquisition, finite automaton approximation, and parsing was designed and implemented. To achieve practical parsing, a finite automaton consisting of about 50 million arcs was built.
* Comparative evaluation of parsing. Parsing experiments on English, Japanese, and Thai were carried out using benchmarks. The method was then evaluated comparatively from several viewpoints, such as accuracy, time, and the number and form of the syntax trees produced.
Through these two years of research, the feasibility of robust incremental parsing by automaton approximation of context-free grammars was verified, and its effect on speeding up parsing was confirmed.
|
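The conversion outlined in the abstract (rule frequencies taken from a treebank, a recursive transition network expanded top-down, frequent arcs expanded first) can be illustrated with a minimal Python sketch. This is not the project's implementation: the tree encoding as nested tuples, the function names `count_rules`, `rule_probabilities`, `approximate`, and `accepts`, the use of relative rule frequency as the arc priority, and the hard arc budget `max_arcs` are all assumptions made for illustration. Empty rules are assumed not to occur.

```python
import heapq
from collections import Counter
from itertools import count


def count_rules(trees):
    """Count grammar-rule uses in trees given as nested tuples (label, child, ...)."""
    counts = Counter()

    def visit(node):
        if isinstance(node, str):  # terminal symbol (here: a part-of-speech tag)
            return
        label, *children = node
        rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
        counts[(label, rhs)] += 1
        for child in children:
            visit(child)

    for tree in trees:
        visit(tree)
    return counts


def rule_probabilities(counts):
    """Turn raw counts into relative frequencies per left-hand side."""
    totals = Counter()
    for (lhs, _), n in counts.items():
        totals[lhs] += n
    probs = {}
    for (lhs, rhs), n in counts.items():
        probs.setdefault(lhs, []).append((rhs, n / totals[lhs]))
    return probs


def approximate(probs, start, max_arcs):
    """Expand nonterminal arcs top-down, most probable first, within max_arcs.

    `start` must be a nonterminal that has at least one rule in `probs`.
    """
    fresh = count(2)  # state 0 is initial, state 1 is final
    tie = count()     # tie-breaker so the heap never compares labels
    arcs = {(0, start, 1)}
    expanded = set()
    queue = [(-1.0, next(tie), 0, start, 1)]
    while queue:
        neg_w, _, src, label, dst = heapq.heappop(queue)
        arc = (src, label, dst)
        if arc in expanded:
            continue
        cost = sum(len(rhs) for rhs, _ in probs[label])
        if len(arcs) + cost > max_arcs:
            break  # budget exhausted: remaining nonterminal arcs stay unexpanded
        arcs.discard(arc)
        expanded.add(arc)
        for rhs, p in probs[label]:
            # One fresh state between each pair of adjacent right-hand-side symbols.
            states = [src] + [next(fresh) for _ in rhs[1:]] + [dst]
            for i, sym in enumerate(rhs):
                new_arc = (states[i], sym, states[i + 1])
                if new_arc in expanded:
                    continue
                arcs.add(new_arc)
                if sym in probs:  # nonterminal: schedule for later expansion
                    heapq.heappush(queue, (neg_w * p, next(tie), *new_arc))
    return arcs


def accepts(arcs, tags, initial=0, final=1):
    """Simulate the (nondeterministic) automaton on a sequence of tags."""
    current = {initial}
    for tag in tags:
        current = {d for (s, label, d) in arcs if s in current and label == tag}
    return final in current


# Toy treebank (hypothetical data, with part-of-speech tags as terminals).
trees = [
    ("S", ("NP", "det", "n"), ("VP", "v", ("NP", "det", "n"))),
    ("S", ("NP", "n"), ("VP", "v")),
]
probs = rule_probabilities(count_rules(trees))
fsa = approximate(probs, "S", max_arcs=100)
print(accepts(fsa, ["det", "n", "v", "n"]), accepts(fsa, ["v"]))
```

For a recursive grammar the expansion never finishes on its own, so in this sketch the arc budget is what makes the result finite. Arcs still labeled with a nonterminal when the budget runs out mark the places where the automaton only approximates the grammar.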
Research Products
(11 results)