2022 Fiscal Year Final Research Report

A Study of Specializing Natural Language Processing Models for Target Texts

Research Project

Project/Area Number	19K20351
Research Category	Grant-in-Aid for Early-Career Scientists
Allocation Type	Multi-year Fund
Review Section	Basic Section 61030:Intelligent informatics-related
Research Institution	Nara Institute of Science and Technology (2021-2022) / Institute of Physical and Chemical Research (2019-2020)
Principal Investigator	Ouchi Hiroki, Nara Institute of Science and Technology, Graduate School of Science and Technology, Assistant Professor (70825463)
Project Period (FY)	2019-04-01 – 2023-03-31
Keywords	instance-based learning / representation learning / structured prediction
Outline of Final Research Achievements	This research had two objectives. The first was to develop a method for specializing distributed representations to target texts and to test its effectiveness; this was accomplished in 2019. From 2020 onward, we worked on the second objective: developing methods that learn representations in which instances with the same label lie close to one another in the feature vector space, and verifying their effectiveness. By applying metric learning (distance learning) to a deep neural network that maps each instance into a feature vector space, we trained the network so that instances with the same label are close to each other in that space. As a result, test instances could be classified based on their similarity to training instances.
Free Research Field	Natural language processing
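Academic Significance and Societal Importance of the Research Achievements	Work on the first objective showed that, when the target text is given, specializing word embeddings (language models) to that text is effective. In practical terms, this suggests that companies and users who already hold the set of (target) texts they want to analyze can analyze them more effectively by specializing a model to those texts, as in the proposed method. Work on the second objective presented a way to mitigate the interpretability problem of conventional deep neural networks. For example, predictions can now be accompanied by their rationale, as in "this test instance is classified this way because it is similar to this training instance."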
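
The similarity-based classification described in this report can be illustrated with a minimal sketch. It is not the project's implementation: the function names, the toy two-dimensional vectors, the labels, and the choice of cosine similarity are all assumptions made for illustration. In practice the vectors would come from a neural encoder trained with metric learning, which is not shown here. The sketch only shows how a prediction can be returned together with the training instance that supports it.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def classify_with_evidence(test_vector, training_set):
    """Return (label, evidence_text, similarity) of the nearest training instance.

    training_set is a list of (text, label, vector) tuples. The vectors are
    assumed to be produced by an already trained encoder (hypothetical here).
    """
    scored = [
        (cosine_similarity(test_vector, vector), text, label)
        for text, label, vector in training_set
    ]
    similarity, text, label = max(scored, key=lambda item: item[0])
    return label, text, similarity


# Hypothetical toy data: hand-written vectors stand in for encoder outputs.
training_set = [
    ("Tokyo", "LOCATION", [0.9, 0.1]),
    ("Kyoto", "LOCATION", [0.8, 0.3]),
    ("Hiroki", "PERSON", [0.1, 0.9]),
]

label, evidence, score = classify_with_evidence([0.2, 0.8], training_set)
print(
    f"Predicted {label}: the test instance is most similar to "
    f"training instance '{evidence}' (cosine similarity {score:.2f})"
)
```

Because the prediction is a lookup over training instances, the rationale comes for free: the nearest training instance is both the basis of the decision and the evidence shown to the user.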