Research Project/Area Number | 21K17814 |
Research Institution | RIKEN |
Principal Investigator | HEINZERLING BENJAMIN, RIKEN Center for Advanced Intelligence Project, Special Postdoctoral Researcher (50846491) |
Project Period (FY) | 2021-04-01 – 2023-03-31 |
Keywords | language model / knowledge base / world knowledge / grounding |
Outline of Annual Research Achievements |
The goal of the first year of this grant was the training and analysis of a knowledge-base-grounded language model.
While not yet published, a prototype model that combines symbolic information from the knowledge base with textual information has been implemented. The next step is to evaluate this prototype both intrinsically, by analyzing its internal representations, and extrinsically, via knowledge-intensive downstream applications.
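The report does not specify the prototype's architecture. As an illustration only, one common way to combine symbolic and textual information is to inject projected knowledge-base entity embeddings into a language model's token embedding sequence at positions linked to KB entities; all names, dimensions, and the KB identifier below are hypothetical, not taken from the project:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the report).
d_model = 8    # language model hidden size
d_entity = 4   # knowledge-base entity embedding size

# Toy embedding tables standing in for pretrained parameters.
token_emb = {"Tokyo": rng.normal(size=d_model), "is": rng.normal(size=d_model)}
entity_emb = {"Q1490": rng.normal(size=d_entity)}  # invented KB id for "Tokyo"

# A (here random) projection maps entity embeddings into the model's space.
W_proj = rng.normal(size=(d_entity, d_model))

def embed(tokens, entity_links):
    """Sum token embeddings with projected KB entity embeddings at
    positions where a token is linked to a KB entity."""
    out = []
    for i, tok in enumerate(tokens):
        vec = token_emb[tok].copy()
        if i in entity_links:  # token grounded in the knowledge base
            vec += entity_emb[entity_links[i]] @ W_proj
        out.append(vec)
    return np.stack(out)

# Token 0 ("Tokyo") is linked to the KB entity; token 1 is not.
seq = embed(["Tokyo", "is"], entity_links={0: "Q1490"})
print(seq.shape)  # (2, 8): one fused vector per token
```

This additive-fusion scheme is only one of several possibilities (concatenation or cross-attention over entity embeddings are common alternatives); the report does not say which the prototype uses.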
Towards the goal of analyzing internal representations of language models, one published case study analyzed the form in which entity and concept knowledge is stored in pretrained language models, finding strong evidence of local storage, i.e., that such knowledge is often encoded in only a small number of neurons.
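The report does not describe the case study's method. As a hedged sketch, localized storage of a fact is often probed by ablating individual hidden units and ranking them by how much the model's score for that fact drops; the toy model below is invented for illustration and plants the fact in two known neurons so the procedure can recover them:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "model": hidden activations for a fact-probing prompt, plus a
# linear readout that scores the fact.
n_hidden = 16
hidden = rng.normal(size=n_hidden)
readout = np.zeros(n_hidden)
readout[[3, 7]] = 2.0  # by construction, only neurons 3 and 7 carry the fact

def fact_score(h):
    return float(readout @ h)

baseline = fact_score(hidden)

# Zero-ablate each neuron in turn and record the change in the fact score.
effects = []
for i in range(n_hidden):
    h = hidden.copy()
    h[i] = 0.0  # ablation of a single unit
    effects.append(abs(baseline - fact_score(h)))

# The neurons whose ablation changes the score most are the candidates
# for where the fact is stored.
top = sorted(range(n_hidden), key=lambda i: -effects[i])[:2]
print(sorted(top))  # [3, 7]
```

In a real pretrained model the same loop would run over transformer feed-forward units with the model's own logit for the fact as the score; the published study may use a different attribution technique.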
|
Current Status of Research Progress (Category) |
3: Progress in research has been slightly delayed.
Reason
During development of the knowledge-base-grounded language model, which was planned to be fully completed during the first year of the funding period, the need for an intrinsic evaluation method was identified. Such a method allows faster evaluation cycles, which in turn will speed up development overall.
While the originally planned evaluation on knowledge-intensive downstream tasks remains the main evaluation method, unforeseen time had to be spent developing a novel method for the intrinsic evaluation and analysis of world knowledge in language models.
|
Strategy for Future Research Activity |
Most of the first year of the funding period was spent prototyping and testing different architectures for the knowledge-base-grounded language model, as well as developing a novel evaluation method. The plan for the next year of the funding period is to wrap up both of these sub-goals, i.e., to publish the work on the novel intrinsic evaluation of world knowledge in language models, and then to scale the current prototype to a full-scale pretrained language model.
|
Causes of Carryover |
As described in the progress report, the training of the full-scale model, originally planned for the first year, is now planned for the second year. Consequently, contrary to the original plan, no costs for using the ABCI cluster were incurred during the first year. Furthermore, due to the continuation of the COVID-19 pandemic, no conference travel was undertaken.
The carryover amount will be used to pay for computing costs on the ABCI cluster and to cover conference travel costs, should any occur.
|