2017 Fiscal Year Final Research Report
Studies on robust statistical parsing across different domains using word embeddings
Project/Area Number |
16H06981
|
Research Category |
Grant-in-Aid for Research Activity Start-up
|
Allocation Type | Single-year Grants |
Research Field |
Intelligent informatics
|
Research Institution | Nara Institute of Science and Technology |
Principal Investigator |
Noji Hiroshi, Nara Institute of Science and Technology, Graduate School of Information Science, Assistant Professor (00782541)
|
Project Period (FY) |
2016-08-26 – 2018-03-31
|
Keywords | parsing / Combinatory Categorial Grammar / domain adaptation |
Outline of Final Research Achievements |
A problem in statistical natural language processing based on machine learning is that a system performs poorly on texts from a domain different from that of its training data. Since most systems, such as parsers, are trained on annotated newspaper text, their performance drops significantly on other kinds of text, e.g., web text and scientific papers. Toward a more robust parsing method across domains, we first developed a new, simple parser based on Combinatory Categorial Grammar (CCG), which has the advantage of not requiring preprocessing such as POS tagging. We also designed a new neural network architecture for parser domain adaptation and verified the effectiveness of the approach.
|
Free Research Field |
Computational linguistics
|