Project/Area Number |
21K17814
|
Research Category |
Grant-in-Aid for Early-Career Scientists
|
Allocation Type | Multi-year Fund |
Review Section |
Basic Section 61030:Intelligent informatics-related
|
Research Institution | Institute of Physical and Chemical Research |
Principal Investigator |
Heinzerling Benjamin RIKEN, Center for Advanced Intelligence Project, Deputy Team Leader (50846491)
|
Project Period (FY) |
2021-04-01 – 2024-03-31
|
Project Status |
Completed (Fiscal Year 2023)
|
Budget Amount |
¥4,550,000 (Direct Cost: ¥3,500,000, Indirect Cost: ¥1,050,000)
Fiscal Year 2022: ¥1,690,000 (Direct Cost: ¥1,300,000, Indirect Cost: ¥390,000)
Fiscal Year 2021: ¥2,860,000 (Direct Cost: ¥2,200,000, Indirect Cost: ¥660,000)
|
Keywords | Language models / structured knowledge / interpretability / explainability / knowledge representation / knowledge base / language model / world knowledge / numeric properties / grounding / NLP |
Outline of Research at the Start |
Just as a human processes text by relating it to her experience and knowledge, this proposal argues that language models (LMs) should relate text to a structured knowledge base (KB). Benefits of the proposed KB-grounded LM are alignment of text and KB, more efficient LM training, and applications in machine translation.
|
Outline of Final Research Achievements |
This research project attained two main achievements. The first achievement is a language model (LM) architecture that enables better integration of structured knowledge. While LMs are commonly trained on large amounts of text, it is often desirable to integrate specific, structured knowledge, such as a proprietary in-house knowledge base or other knowledge that is not covered by the LM's training data. Here, we developed a bi-encoder architecture that enables such an integration without requiring costly retraining. The second achievement is an interpretation method for analyzing how well LMs represent a specific kind of structured knowledge, namely numeric properties such as a person's year of birth or a city's population.
|
Academic Significance and Societal Importance of the Research Achievements |
The first achievement provides an efficient method for integrating structured knowledge into existing language models, which allows users to adapt LMs to their specific needs without costly retraining. The second achievement improves our understanding of how LMs represent structured knowledge such as numeric properties, thereby increasing transparency.
|