
2023 Fiscal Year Final Research Report

Research and development of information extraction methods using word sense disambiguation and domain adaptation

Research Project

Project/Area Number 17KK0002
Research Category

Fund for the Promotion of Joint International Research (Fostering Joint International Research)

Allocation Type Multi-year Fund
Research Field Intelligent informatics
Research Institution Tokyo University of Agriculture and Technology (2021-2023)
Ibaraki University (2017-2020)

Principal Investigator

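Komiya Kanako  Tokyo University of Agriculture and Technology, Graduate School of Engineering (Institute of Engineering), Associate Professor (10592339)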

Project Period (FY) 2018 – 2023
Keywords problem extraction / annotation / scientific and technical papers / word sense disambiguation / BERT
Outline of Final Research Achievements

We extracted statements of "problems" (in the sense of something problematic, not tasks to be solved) from Japanese scientific and technical papers. We took as our starting point an earlier study that performed the same extraction on English scientific and technical papers. Because the expressions used to describe problems in Japanese are complex, we defined the forms of expression linguistically and established annotation rules for them. Following these rules, we also annotated whether the content of the problem referred to by the word "problem" appears in the same sentence, and ran classification experiments using several methods.

Free Research Field

Natural language processing

Academic Significance and Societal Importance of the Research Achievements

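We analyzed how problem content is written in Japanese papers in comparison with English. The work on English papers handled only problem content written in the form "The problem is X." In Japanese, we found that in addition to copular expressions such as 「Xが問題だ」 ("X is the problem"), modifying expressions such as 「Xという問題」 ("the problem that X") occur frequently. Based on these findings, we established annotation rules for problem content and built a corpus. In doing so, we analyzed several properties and reflected them in the rules: problem content can be nested, it can be expressed by a sentence or by a word or phrase, and the granularity of the problem content referred to varies.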
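As a rough illustration of the two forms of expression, the copular form 「Xが問題だ」 and the modifying form 「Xという問題」, the following sketch matches them with surface-level regular expressions. It is not the method used in the project, which relied on linguistically defined annotation rules and classification experiments. The pattern definitions and the function name `find_problem_expressions` are hypothetical, and a regular expression cannot handle nesting or varying granularity.

```python
import re

# Hypothetical surface patterns, for illustration only.
# Copular form: "X が/は 問題 だ/である/です/となる"
COPULAR = re.compile(r"(?P<content>.+?)(?:が|は)問題(?:だ|である|です|となる)")
# Modifying form: "X という問題"
MODIFIER = re.compile(r"(?P<content>.+?)という問題")


def find_problem_expressions(sentence):
    """Return (pattern_type, content) pairs for each surface pattern found."""
    hits = []
    for label, pattern in (("copular", COPULAR), ("modifier", MODIFIER)):
        match = pattern.search(sentence)
        if match:
            hits.append((label, match.group("content")))
    return hits


if __name__ == "__main__":
    for s in ["計算コストが高いことが問題だ。", "学習データが不足するという問題がある。"]:
        print(s, find_problem_expressions(s))
```

A sentence with neither pattern yields an empty list, which loosely corresponds to the case where the problem content is not stated in the same sentence as the word for "problem".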


Published: 2025-01-30  
