Project/Area Number |
26730126
|
Research Category |
Grant-in-Aid for Young Scientists (B)
|
Allocation Type | Multi-year Fund |
Research Field |
Intelligent informatics
|
Research Institution | NTT Communication Science Laboratories |
Principal Investigator |
Hayashi Katsuhiko, Nippon Telegraph and Telephone Corporation, NTT Communication Science Laboratories, Innovative Communication Laboratory, Researcher (50725794)
|
Project Period (FY) |
2014-04-01 – 2017-03-31
|
Project Status |
Completed (Fiscal Year 2016)
|
Budget Amount |
¥2,470,000 (Direct Cost: ¥1,900,000, Indirect Cost: ¥570,000)
Fiscal Year 2016: ¥390,000 (Direct Cost: ¥300,000, Indirect Cost: ¥90,000)
Fiscal Year 2015: ¥260,000 (Direct Cost: ¥200,000, Indirect Cost: ¥60,000)
Fiscal Year 2014: ¥1,820,000 (Direct Cost: ¥1,400,000, Indirect Cost: ¥420,000)
|
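Keywords | discourse structure analysis / ellipsis resolution / matrix factorization / hashing / low-rank approximation / branch and bound / linear-time language parsing / spoken language data analysis / natural language processing / discourse analysis / rhetorical structure analysis |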
Outline of Final Research Achievements |
I investigated hashing and matrix factorization techniques for efficiently analyzing large-scale text data in various domains. First, I proposed a fast and accurate parsing algorithm for discourse tree structure analysis of English newswire texts. I also presented a text summarization method that uses discourse trees, which improved summarization accuracy. Second, I proposed a method to automatically detect and insert missing elements in English and Japanese speech and newswire texts. Finally, I proposed a knowledge (word thesaurus) embedding method for fast word similarity computation. In the future, I will apply these methods to more advanced NLP applications such as machine translation and question answering.
|