Knowledge-Base-Grounded Language Models
Project/Area Number | 21K17814
Research Category | Grant-in-Aid for Early-Career Scientists
Allocation Type | Multi-year Fund
Review Section | Basic Section 61030: Intelligent informatics-related
Research Institution | Institute of Physical and Chemical Research
Principal Investigator | HEINZERLING BENJAMIN, RIKEN (Institute of Physical and Chemical Research), Center for Advanced Intelligence Project, Researcher (50846491)
Project Period (FY) | 2021-04-01 – 2024-03-31
Project Status | Granted (Fiscal Year 2022)
Budget Amount | ¥4,550,000 (Direct Cost: ¥3,500,000, Indirect Cost: ¥1,050,000)
Fiscal Year 2022: ¥1,690,000 (Direct Cost: ¥1,300,000, Indirect Cost: ¥390,000)
Fiscal Year 2021: ¥2,860,000 (Direct Cost: ¥2,200,000, Indirect Cost: ¥660,000)
Keywords | language model / knowledge base / world knowledge / grounding / NLP |
Outline of Research at the Start |
Just as a human processes text by relating it to her experience and knowledge, this proposal argues that language models (LMs) should relate text to a structured knowledge base (KB). Benefits of the proposed KB-grounded LM are alignment of text and KB, more efficient LM training, and applications in machine translation.
Outline of Annual Research Achievements |
In the second year of the grant period, we devised, implemented, and evaluated a neural network architecture for combining symbolic information from a knowledge base with non-symbolic representations of textual information.
The architecture consists of two multi-layered encoder stacks, one for symbolic and one for textual information. The two stacks interact at arbitrary layers via cross-attention and gates that determine how much each encoder uses the other encoder's information to update its internal representations.
Evaluation on distantly supervised relation extraction benchmarks demonstrated state-of-the-art performance. The work was published at EMNLP 2022 and, domestically, at NLP 2023, where it was recognized as an outstanding paper.
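The gated cross-attention interaction between the two encoder stacks can be sketched as follows. This is a minimal, illustrative numpy sketch, not the published implementation: all function names, dimensions, and the single-head, per-layer gating scheme are assumptions made here for exposition.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values):
    # Scaled dot-product attention of one encoder's states over the other's.
    # queries: (n_q, d), keys_values: (n_kv, d) -> (n_q, d)
    scores = queries @ keys_values.T / np.sqrt(queries.shape[-1])
    return softmax(scores, axis=-1) @ keys_values

def gated_update(own, other, w_gate, b_gate):
    # One interaction step: attend to the other encoder's states, then let a
    # sigmoid gate decide how much of the attended information flows into the
    # encoder's own representations (hypothetical gating parameterization).
    attended = cross_attention(own, other)
    gate_in = np.concatenate([own, attended], axis=-1) @ w_gate + b_gate
    gate = 1.0 / (1.0 + np.exp(-gate_in))          # elementwise in (0, 1)
    return gate * attended + (1.0 - gate) * own

rng = np.random.default_rng(0)
d = 8
text_states = rng.normal(size=(5, d))   # textual encoder layer output (5 tokens)
kb_states = rng.normal(size=(3, d))     # symbolic/KB encoder layer output (3 entities)
w = rng.normal(size=(2 * d, d)) * 0.1
b = np.zeros(d)

updated_text = gated_update(text_states, kb_states, w, b)
print(updated_text.shape)  # (5, 8)
```

With the gate near zero the textual encoder ignores the KB; with the gate near one it adopts the attended KB information, so the model can learn per-position how strongly to ground text in the knowledge base.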
Current Status of Research Progress |
3: Progress in research has been slightly delayed.
Reason
Some of the planned experiments could not be conducted because the primary computing resource (ABCI) became unexpectedly unavailable due to budgeting constraints.
Strategy for Future Research Activity |
The rapid development of large language models and generative AI requires adjusting our planned model design, implementation and evaluation.
Report | (2 results)
Research Products | (6 results)