A Multimodal Natural Language Understanding Model That Collaboratively Adapts to Sequential Environmental Changes in the Real World

Research Project

Project/Area Number	21K21343
Research Category	Fund for the Promotion of Joint International Research (Home-Returning Researcher Development Research)
Allocation Type	Multi-year Fund
Review Section	Informatics
Research Institution	Tohoku University
Principal Investigator	Keisuke Sakaguchi (坂口慶祐), Tohoku University, Graduate School of Information Sciences, Associate Professor (20934087)
Project Period (FY)	2022-02-18 – 2025-03-31
Project Status	Granted (Fiscal Year 2023)
Budget Amount	¥57,070,000 (Direct Cost: ¥43,900,000, Indirect Cost: ¥13,170,000)
Keywords	Natural language processing / Large language models / Multimodal / Deep learning
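Outline of Research at the Start	While deep learning has driven major advances in natural language processing, current models still cannot adapt to tasks in which constantly changing contextual information matters, as it does in the real world. This project proposes and implements a multimodal model that learns not only linguistic information but also visual and auditory information, in an integrated and sequential manner, in environments where context changes as it does in the real world.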
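Outline of Annual Research Achievements	For humans and AI to interact and collaborate naturally in modern society, natural language processing models that take the user's context into account and respond flexibly are indispensable. Such models are expected to enable dialogue tailored to user needs and to make communication between AI and users smoother. At present, however, there is a "gap between high accuracy on benchmarks and low accuracy in applications where dynamic context matters." In other words, AI performance has improved in fixed contexts, but its ability to handle broader and more complex situations remains limited. To address this, the project proposes a multimodal model that can cope with environments in which situations and contexts change constantly, as in the real world. The model learns not only linguistic information but also visual and auditory information in an integrated and sequential manner, allowing AI to understand the user's current situation more deeply and to respond appropriately. In the second year of the project, we built RealTime QA as a benchmark for environments with changing situations and contexts and, as a target for multimodality, began ongoing work on understanding diagrams, which combine images with abstract symbolic processing. We also produced results on potential applications, including adapting large language models to Japanese and studying interaction between humans and AI in creative activities. Research outputs comprise 6 papers accepted at international conferences, 14 domestic conference presentations, and 10 invited talks in Japan and abroad.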
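Current Status of Research Progress	2: Research has progressed on the whole more than it was originally planned. Reason: In the second year, we were able to publish the results of our work to date in various forms, including papers, software, datasets, and presentations at international conferences. In 2023, improvements in and releases of large language models in the natural language processing field advanced faster than researchers had anticipated, so part of this project's original scope was resolved to some extent. In light of this, we flexibly took on more challenging problems beyond multimodality with simple images, such as understanding diagram images that involve abstract, symbolic concepts, and were able to begin preliminary experiments. We also launched the RealTime QA project as a benchmark for measuring a model's ability to handle dynamic context and presented it at an international conference. Specifically, it semi-automatically extracts the latest world events and automatically builds them into a format for evaluating large language models. This supports comparative evaluation of existing models, coverage of newly released models, and continuous measurement using the evaluation platform and its results.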
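Strategy for Future Research Activity	As in the previous fiscal year, improvements in large language models and releases of APIs and related tools continue to advance dramatically across the field, and it is important to adopt these technologies quickly and flexibly. In the final year, we will focus within multimodality on diagram understanding and on applications of large language models (interaction with humans), while promoting the societal implementation and application of the project's findings.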

Report

(2 results)

2023 Research-status Report
2022 Research-status Report

Research Products
(35 results)


Int'l Joint Research (3 results) / Journal Article (6 results; of which Int'l Joint Research: 4, Peer Reviewed: 6, Open Access: 6) / Presentation (24 results; of which Int'l Joint Research: 4, Invited: 1) / Remarks (1 result) / Funded Workshop (1 result)

[Int'l Joint Research] University of Washington/Allen Institute for AI/Yale University (USA)
- Related Report
  2023 Research-status Report
[Int'l Joint Research] Mohamed bin Zayed University of AI (United Arab Emirates)
- Related Report
  2023 Research-status Report
[Int'l Joint Research] University of Washington/Allen Institute for AI/Yale University (USA)
- Related Report
  2022 Research-status Report
[Journal Article] RealTime QA: What's the Answer Right Now? (2023)
- Author(s)
  Jungo Kasai, Keisuke Sakaguchi, Yoichi Takahashi, Ronan Le Bras, Akari Asai, Xinyan Velocity Yu, Dragomir Radev, Noah A. Smith, Yejin Choi, Kentaro Inui
- Journal Title
  Thirty-seventh Conference on Neural Information Processing Systems Datasets and Benchmarks Track
- Related Report
  2023 Research-status Report
- Peer Reviewed / Open Access / Int'l Joint Research
[Journal Article] Test-time Augmentation for Factual Probing (2023)
- Author(s)
  Go Kamoda, Benjamin Heinzerling, Keisuke Sakaguchi, Kentaro Inui
- Journal Title
  Findings of the Association for Computational Linguistics: EMNLP 2023
  Pages: 3650-3661
- DOI
  10.18653/v1/2023.findings-emnlp.236
- Related Report
  2023 Research-status Report
- Peer Reviewed / Open Access / Int'l Joint Research
[Journal Article] I2D2: Inductive Knowledge Distillation with NeuroLogic and Self-Imitation (2023)
- Author(s)
  Chandra Bhagavatula, Jena D Hwang, Doug Downey, Ronan Le Bras, Ximing Lu, Keisuke Sakaguchi, Swabha Swayamdipta, Peter West, Yejin Choi
- Journal Title
  Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
  Pages: 9614-9630
- DOI
  10.18653/v1/2023.acl-long.535
- Related Report
  2023 Research-status Report
- Peer Reviewed / Open Access / Int'l Joint Research
[Journal Article] ELQA: A Corpus of Metalinguistic Questions and Answers about English (2023)
- Author(s)
  Shabnam Behzad, Keisuke Sakaguchi, Nathan Schneider, Amir Zeldes
- Journal Title
  Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
  Pages: 2031-2047
- DOI
  10.18653/v1/2023.acl-long.113
- Related Report
  2023 Research-status Report
- Peer Reviewed / Open Access / Int'l Joint Research
[Journal Article] Do Deep Neural Networks Capture Compositionality in Arithmetic Reasoning? (2023)
- Author(s)
  Keito Kudo, Yoichi Aoki, Tatsuki Kuribayashi, Ana Brassard, Masashi Yoshikawa, Keisuke Sakaguchi, Kentaro Inui
- Journal Title
  Proceedings of the 2023 Conference of the European Chapter of the Association for Computational Linguistics
  Pages: 1343-1354
- DOI
  10.18653/v1/2023.eacl-main.98
- Related Report
  2023 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] Empirical Investigation of Neural Symbolic Reasoning Strategies (2023)
- Author(s)
  Yoichi Aoki, Keito Kudo, Tatsuki Kuribayashi, Ana Brassard, Masashi Yoshikawa, Keisuke Sakaguchi, Kentaro Inui
- Journal Title
  Findings of the Association for Computational Linguistics: EACL 2023
  Pages: 1154-1162
- DOI
  10.18653/v1/2023.findings-eacl.86
- Related Report
  2023 Research-status Report
- Peer Reviewed / Open Access
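[Presentation] An Attempt at Automatically Generating "Interesting" Tanka Using RLHF (2024)
- Author(s)
  羽根田賢和, 浦川通, 田口雄哉, 田森秀明, 坂口慶祐
- Organizer
  Proceedings of the 30th Annual Meeting of the Association for Natural Language Processing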
- Related Report
  2023 Research-status Report
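[Presentation] Analyzing the Characteristics of Text Generated by Large Language Models Based on Detector Judgments (2024)
- Author(s)
  三浦東子, 谷口雅弥, 坂口慶祐, 乾健太郎
- Organizer
  Proceedings of the 30th Annual Meeting of the Association for Natural Language Processing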
- Related Report
  2023 Research-status Report
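[Presentation] Dynamic Changes in Search Strategies in Chain-of-Thought Reasoning by Language Models (2024)
- Author(s)
  青木洋一, 工藤慧音, 曾根周作, 栗林樹生, 谷口雅弥, 坂口慶祐, 乾健太郎
- Organizer
  Proceedings of the 30th Annual Meeting of the Association for Natural Language Processing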
- Related Report
  2023 Research-status Report
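[Presentation] Knowledge Deletion from Language Models: Deleting Knowledge of Frequent Entities Has Catastrophic Side Effects (2024)
- Author(s)
  高橋良允, 鴨田豪, Benjamin Heinzerling, 坂口慶祐, 乾健太郎
- Organizer
  Proceedings of the 30th Annual Meeting of the Association for Natural Language Processing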
- Related Report
  2023 Research-status Report
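[Presentation] Internal Mechanisms of Autoregressive Language Models on Arithmetic Reasoning Problems (2024)
- Author(s)
  工藤慧音, 青木洋一, 栗林樹生, 谷口雅弥, 曾根周作, 坂口慶祐, 乾健太郎
- Organizer
  Proceedings of the 30th Annual Meeting of the Association for Natural Language Processing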
- Related Report
  2023 Research-status Report
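[Presentation] Evaluating GPT Models on the Japanese Bar Examination (2024)
- Author(s)
  チェジョンミン, 笠井淳吾, 坂口慶祐
- Organizer
  Proceedings of the 30th Annual Meeting of the Association for Natural Language Processing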
- Related Report
  2023 Research-status Report
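[Presentation] Do Image Embeddings Trained on Natural Images Contain Information That Characterizes Diagrams? (2024)
- Author(s)
  吉田遥音, 工藤慧音, 青木洋一, 田中涼太, 斉藤いつみ, 坂口慶祐, 乾健太郎
- Organizer
  Proceedings of the 30th Annual Meeting of the Association for Natural Language Processing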
- Related Report
  2023 Research-status Report
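[Presentation] J-UniMorph: Systematizing Semantic Classification in Japanese Morphology (2024)
- Author(s)
  松崎孝介, 谷口雅弥, 乾健太郎, 坂口慶祐
- Organizer
  Proceedings of the 30th Annual Meeting of the Association for Natural Language Processing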
- Related Report
  2023 Research-status Report
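[Presentation] Multifaceted Evaluation of Long-Form Generation: Toward Better Human and Automatic Evaluation (2024)
- Author(s)
  鴨田豪, 浅井明里, Ana Brassard, 坂口慶祐
- Organizer
  Proceedings of the 30th Annual Meeting of the Association for Natural Language Processing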
- Related Report
  2023 Research-status Report
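[Presentation] Analyzing the Japanese Zero Anaphora Resolution Ability of Large Language Models (2024)
- Author(s)
  野末慎之介, 石月由紀子, 松林優一郎, 坂口慶祐
- Organizer
  Proceedings of the 30th Annual Meeting of the Association for Natural Language Processing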
- Related Report
  2023 Research-status Report
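[Presentation] Hagi bot: A Multimodal Dialogue System That Conducts Natural Discussions Through LLM-Based Dialogue State Tracking and Human-Like Behavior (2023)
- Author(s)
  中野雄斗, 野末慎之介, 穀田一真, 有山知希, 佐藤魁, 曾根周作, 亀井遼平, 謝素春, 成田風香, 守屋彰二, 赤間怜奈, 松林優一郎, 坂口慶祐
- Organizer
  JSAI Technical Report, Special Interest Group on Spoken Language Understanding and Dialogue Processing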
- Related Report
  2023 Research-status Report
[Presentation] Proposing a Text-Based Diagram Generation Task (2023)
- Author(s)
  吉田遥音, 工藤慧音, 青木洋一, 坂口慶祐
- Organizer
  18th Symposium of the Young Researcher Association for NLP Studies (YANS)
- Related Report
  2023 Research-status Report
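[Presentation] Morphosemantics-Centered Verb Conjugation for Japanese Language Learning (2023)
- Author(s)
  松崎孝介, 谷口雅弥, 坂口慶祐, 乾健太郎
- Organizer
  18th Symposium of the Young Researcher Association for NLP Studies (YANS)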
- Related Report
  2023 Research-status Report
[Presentation] Evaluating the Ability of Large Language Models to Generate Implicit Inferences (2023)
- Author(s)
  根岸直生, 坂口慶祐, 乾健太郎
- Organizer
  22nd Forum on Information Technology (FIT2023)
- Related Report
  2023 Research-status Report
[Presentation] Empirical Investigation of Neural Symbolic Reasoning Strategies (2023)
- Author(s)
  Yoichi Aoki, Keito Kudo, Tatsuki Kuribayashi, Ana Brassard, Masashi Yoshikawa, Keisuke Sakaguchi, and Kentaro Inui
- Organizer
  Findings of the Association for Computational Linguistics: EACL 2023
- Related Report
  2022 Research-status Report
- Int'l Joint Research
[Presentation] Do Deep Neural Networks Capture Compositionality in Arithmetic Reasoning? (2023)
- Author(s)
  Keito Kudo, Yoichi Aoki, Tatsuki Kuribayashi, Ana Brassard, Masashi Yoshikawa, Keisuke Sakaguchi, and Kentaro Inui
- Organizer
  Proceedings of the 2023 Conference of the European Chapter of the Association for Computational Linguistics
- Related Report
  2022 Research-status Report
- Int'l Joint Research
[Presentation] Test-time Augmentation for Factual Probing (2023)
- Author(s)
  Go Kamoda, Benjamin Heinzerling, Keisuke Sakaguchi, Kentaro Inui
- Organizer
  29th Annual Meeting of the Association for Natural Language Processing (NLP2023)
- Related Report
  2022 Research-status Report
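[Presentation] Methods for Teaching Reasoning Processes in Neural Symbolic Reasoning (2023)
- Author(s)
  青木洋一, 工藤慧音, Ana Brassard, 栗林樹生, 吉川将司, 坂口慶祐, 乾健太郎
- Organizer
  29th Annual Meeting of the Association for Natural Language Processing (NLP2023)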
- Related Report
  2022 Research-status Report
[Presentation] Towards grammatically-informed feedback comments (2023)
- Author(s)
  Diana Galvan-Sosa, Steven Coyne, 坂口慶祐, 乾健太郎
- Organizer
  29th Annual Meeting of the Association for Natural Language Processing (NLP2023)
- Related Report
  2022 Research-status Report
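[Presentation] Compositional Reasoning Ability of Neural Models on Arithmetic Problems (2023)
- Author(s)
  工藤慧音, 青木洋一, 栗林樹生, Ana Brassard, 吉川将司, 坂口慶祐, 乾健太郎
- Organizer
  29th Annual Meeting of the Association for Natural Language Processing (NLP2023)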
- Related Report
  2022 Research-status Report
[Presentation] Strengthening the Adversarial Robustness of NLI with Causal Prompts (2023)
- Author(s)
  Pride Kavumba, Ana Brassard, Benjamin Heinzerling, 坂口慶祐, 乾健太郎
- Organizer
  29th Annual Meeting of the Association for Natural Language Processing (NLP2023)
- Related Report
  2022 Research-status Report
[Presentation] Developing a Typology for Language Learning Feedback (2023)
- Author(s)
  Steven Coyne, Diana Galvan-Sosa, 坂口慶祐, 乾健太郎
- Organizer
  29th Annual Meeting of the Association for Natural Language Processing (NLP2023)
- Related Report
  2022 Research-status Report
[Presentation] Twist Decoding: Diverse Generators Guide Each Other (2022)
- Author(s)
  Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Hao Peng, Ximing Lu, Dragomir Radev, Yejin Choi, Noah A. Smith
- Organizer
  Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
- Related Report
  2022 Research-status Report
- Int'l Joint Research
[Presentation] Large Language Models: What will happen next? (2022)
- Author(s)
  Keisuke Sakaguchi
- Organizer
  2022 BrainLink X-Lab Day, Korean Federation of Science and Technology Societies (KOFST)
- Related Report
  2022 Research-status Report
- Int'l Joint Research / Invited
[Remarks] RealTime QA
- URL
  https://realtimeqa.github.io/
- Related Report
  2023 Research-status Report
[Funded Workshop] Hugging Face x TohokuNLP Joint Workshop (2023)
- Related Report
  2023 Research-status Report
