Theoretically founded algorithms for the automatic production of analogy tests in NLP

Research Project

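Project/Area Number	21K12038
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Multi-year Fund
Section	General
Review Section	Basic Section 61030: Intelligent informatics-related
Research Institution	Waseda University
Principal Investigator	LEPAGE YVES  Waseda University, Faculty of Science and Engineering (Graduate School of Information, Production and Systems / Center), Professor (70573608)
Project Period (FY)	2021-04-01 – 2024-03-31
Project Status	Completed (Fiscal Year 2023)
Budget Amount	¥4,030,000 (Direct Cost: ¥3,100,000, Indirect Cost: ¥930,000) Fiscal Year 2023: ¥1,040,000 (Direct Cost: ¥800,000, Indirect Cost: ¥240,000) Fiscal Year 2022: ¥1,820,000 (Direct Cost: ¥1,400,000, Indirect Cost: ¥420,000) Fiscal Year 2021: ¥1,170,000 (Direct Cost: ¥900,000, Indirect Cost: ¥270,000)
Keywords	cognitive ability / analogy / exhaustive extraction of analogies / word embedding space / neural network models for analogies between sentences / analogies between real values / analogies between Boolean values / analogies between integers / natural language processing / word embeddings / inference / embeddings / analogy datasets / algorithms / deep learning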
Outline of Research at the Start	The most important recent breakthrough in Natural Language Processing (NLP) is the vector representation of words or parts of sentences. To assess the quality of vector representations of words, analogy test sets are used (France : Paris :: Japan : x => x = Tokyo). Up to now, the production of such data sets has not been automatic. This research will study, explore and release theoretically well-founded methods to automatically extract analogy test sets, not only between words but also between parts of sentences and, ideally, for any language.
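Outline of Final Research Achievements	In recent artificial intelligence, the meaning of words and sentences is represented by numbers. Analogy datasets are used to assess whether meaning is represented correctly. However, the construction of analogy datasets has not been automated so far: datasets built by hand in English remain biased towards English even when translated into Japanese, and they are also biased mainly towards a few particular kinds of analogies. By automatically building multilingual analogy datasets, we showed that they are useful for the decomposition and generation of regular and irregular word forms, and we discovered new semantic analogies between words. By building analogy datasets between sentences, we learned which sentence patterns contain more analogies. We proposed a way of building sentence analogy datasets based on paraphrases, and we also proposed neural network models that understand analogies.
Academic Significance and Societal Importance of the Research Achievements	Recognising analogies is one of the natural cognitive behaviours of humans. For example, "queen" is a possible answer to the question "man" : "woman" :: "king" : what? Likewise, "I like this song." : "I feel like singing." :: "I like this game." : "I feel like playing." is an example between sentences. Analogy datasets are needed to measure to what extent the word and sentence representations of state-of-the-art artificial intelligence possess this cognitive ability. This research examined the construction of analogy datasets between words and between sentences. We proposed and examined methods that work not only for English but for multiple languages, and that cover a wider range of analogies than the classical ones (gender, country and capital).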
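An analogy test of the kind quoted in the outline (France : Paris :: Japan : x) is commonly solved on word vectors with the vector-offset heuristic, which looks for the word closest to b - a + c. The following is only a minimal sketch of that general heuristic, not the method developed in this project: the 2-D vectors, the vocabulary and the function names `cosine` and `solve_analogy` are all invented for illustration.

```python
import math

# Toy 2-D "embeddings", invented for illustration only (not from a real model).
EMBEDDINGS = {
    "France": (1.0, 0.0),
    "Paris": (1.0, 1.0),
    "Japan": (3.0, 0.0),
    "Tokyo": (3.0, 1.0),
    "Berlin": (5.0, 1.2),
    "sushi": (3.2, -1.0),
}


def cosine(u, v):
    """Cosine similarity between two vectors given as tuples."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def solve_analogy(a, b, c, embeddings=EMBEDDINGS):
    """Solve a : b :: c : x by returning the word closest to b - a + c.

    The three query words are excluded from the candidates, as is usual
    when evaluating embeddings on analogy test sets.
    """
    target = tuple(
        vb - va + vc
        for va, vb, vc in zip(embeddings[a], embeddings[b], embeddings[c])
    )
    candidates = [w for w in embeddings if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(embeddings[w], target))


if __name__ == "__main__":
    print(solve_analogy("France", "Paris", "Japan"))
```

An analogy test set is then simply a list of such quadruples; an embedding space is scored by the fraction of quadruples for which the returned word matches the expected fourth term.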

Report

(4 results)

Research Products
(26 results)

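All: Journal Article (7 results) (of which International Co-authorship: 6 results, Peer Reviewed: 6 results, Open Access: 3 results) Presentation (17 results) (of which International Conference: 11 results, Invited: 6 results) Remarks (2 results)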

[Journal Article] A study of universal morphological analysis using morpheme-based, holistic, and neural approaches under various data size conditions (2024)
- Authors/Presenters
  R. Fam and Y. Lepage
- Journal
  Annals of Mathematics and Artificial Intelligence
  Volume: To appear
- Related Report
  2023 Annual Research Report
- Peer Reviewed / Open Access / International Co-authorship
[Journal Article] Learning from masked analogies between sentences at multiple levels of formality (2023)
- Authors/Presenters
  Wang Liyan, Lepage Yves
- Journal
  Annals of Mathematics and Artificial Intelligence
  Volume: 93 Issue: 2 Pages: 237-261
- DOI
  10.1007/s10472-023-09918-2
- Related Report
  2023 Annual Research Report
- Peer Reviewed / Open Access / International Co-authorship
[Journal Article] A study in the generation of multilingually aligned middle sentences (2023)
- Authors/Presenters
  M. Eget, X. Yang, and Y. Lepage
- Journal
  Proceedings of the 10th Language & Technology Conference (LTC 2023) & Human Language Technologies as a Challenge for Computer Science and Linguistics
  Volume: 0 Pages: 45-49
- Related Report
  2022 Research-status Report
- Peer Reviewed / International Co-authorship
[Journal Article] Investigating parallelograms: Assessing several word embedding spaces against various analogy test sets in several languages using approximation (2023)
- Authors/Presenters
  R. Fam and Y. Lepage
- Journal
  Proceedings of the 10th Language & Technology Conference (LTC 2023) & Human Language Technologies as a Challenge for Computer Science and Linguistics
  Volume: 0 Pages: 68-72
- Related Report
  2022 Research-status Report
- Peer Reviewed / International Co-authorship
[Journal Article] Solving sentence analogies by using embedding spaces combined with a vector-to-sequence decoder or by fine-tuning pre-trained language models (2023)
- Authors/Presenters
  L. Wang, Z. Pang, H. Wang, X. Zhao, and Y. Lepage
- Journal
  Proceedings of the 10th Language & Technology Conference (LTC 2023) & Human Language Technologies as a Challenge for Computer Science and Linguistics
  Volume: 0 Pages: 325-330
- Related Report
  2022 Research-status Report
- Peer Reviewed / International Co-authorship
[Journal Article] Organising lexica into analogical grids: a study of a holistic approach for morphological generation under various sizes of data in various languages (2022)
- Authors/Presenters
  Fam Rashel, Lepage Yves
- Journal
  Journal of Experimental & Theoretical Artificial Intelligence
  Volume: 0 Pages: 1-26
- Related Report
  2022 Research-status Report
[Journal Article] A Study of Analogical Density in Various Corpora at Various Granularity (2021)
- Authors/Presenters
  Fam Rashel, Lepage Yves
- Journal
  Information
  Volume: 12 Issue: 8 Pages: 314-314
- DOI
  10.3390/info12080314
- Related Report
  2021 Research-status Report
- Peer Reviewed / Open Access / International Co-authorship
[Presentation] Analogie et moyenne generalisee (2024)
- Authors/Presenters
  Y. Lepage and M. Couceiro
- Conference/Venue
  In Actes de la conference Journees d'intelligence artificielle francaises -- Plateforme francaise d'intelligence artificielle (PFIA-JIAF 2024) (Accepted, to appear)
- Related Report
  2023 Annual Research Report
- International Conference
[Presentation] Continued pre-training on sentence analogies for translation with small data (2024)
- Authors/Presenters
  L. Wang, H. Wang, and Y. Lepage
- Conference/Venue
  LREC-COLING 2024 (to appear)
- Related Report
  2023 Annual Research Report
- International Conference
[Presentation] A study in the generation of multilingually aligned middle sentences (2023)
- Authors/Presenters
  M. Eget, X. Yang, and Y. Lepage
- Conference/Venue
  Proceedings of the 10th Language & Technology Conference (LTC 2023) -- Human Language Technologies as a Challenge for Computer Science and Linguistics, pages 45--49, April 2023.
- Related Report
  2023 Annual Research Report
- International Conference
[Presentation] Investigating parallelograms: Assessing several word embedding spaces against various analogy test sets in several languages using approximation (2023)
- Authors/Presenters
  R. Fam and Y. Lepage
- Conference/Venue
  Proceedings of the 10th Language & Technology Conference (LTC 2023) -- Human Language Technologies as a Challenge for Computer Science and Linguistics, pages 68--72, April 2023.
- Related Report
  2023 Annual Research Report
- International Conference
[Presentation] Solving sentence analogies by using embedding spaces combined with a vector-to-sequence decoder or by fine-tuning pre-trained language models (2023)
- Authors/Presenters
  L. Wang, Z. Pang, H. Wang, X. Zhao, and Y. Lepage
- Conference/Venue
  Proceedings of the 10th Language & Technology Conference (LTC 2023) -- Human Language Technologies as a Challenge for Computer Science and Linguistics, pages 325--330, April 2023.
- Related Report
  2023 Annual Research Report
- International Conference
[Presentation] Resolution of analogies between strings in the case of multiple solutions (2023)
- Authors/Presenters
  X. Deng and Y. Lepage
- Conference/Venue
  In CEUR, editor, Proceedings of ICCBR: Workshop on Analogies: from Theory to Applications (ATA@ICCBR 2023), CEUR Workshop Proceedings, pages 3-14, July 2023.
- Related Report
  2023 Annual Research Report
- International Conference
[Presentation] Embedding-to-embedding method based on autoencoder for solving sentence analogies (2023)
- Authors/Presenters
  W. Mao and Y. Lepage
- Conference/Venue
  In CEUR, editor, Proceedings of ICCBR: Workshop on Analogies: from Theory to Applications (ATA@ICCBR 2023), CEUR Workshop Proceedings, pages 15-26, July 2023.
- Related Report
  2023 Annual Research Report
- International Conference
[Presentation] Improving sentence embedding with sentence relationships from word analogies (2023)
- Authors/Presenters
  Q. Zhang and Y. Lepage
- Conference/Venue
  In CEUR, editor, Proceedings of ICCBR: Workshop on Analogies: from Theory to Applications (ATA@ICCBR 2023), CEUR Workshop Proceedings, pages 43-53, July 2023.
- Related Report
  2023 Annual Research Report
- International Conference
[Presentation] Formulae for the solution of an analogical equation between Booleans using the Sheffer stroke (NAND) or the Peirce arrow (NOR) (2023)
- Authors/Presenters
  Y. Lepage
- Conference/Venue
  Proceedings of the Workshop Interactions between analogies and machine learning, colocated with IJCAI 2023 (IARML@IJCAI 2023), pages 3-14, August 2023.
- Related Report
  2023 Annual Research Report
- International Conference
[Presentation] A framework for neural machine translation by fuzzy analogies (2023)
- Authors/Presenters
  L. Wang, B. Wloka, and Y. Lepage
- Conference/Venue
  Proceedings of the Workshop Interactions between analogies and machine learning, colocated with IJCAI 2023 (IARML@IJCAI 2023), pages 47-55, August 2023.
- Related Report
  2023 Annual Research Report
- International Conference
[Presentation] Analogie et donnees de langue (Analogy and language data, in French) (2023)
- Authors/Presenters
  Y. Lepage
- Conference/Venue
  Colloquium LORIA, 15 nov. 2023, LORIA, Nancy, France
- Related Report
  2023 Annual Research Report
- Invited
[Presentation] Analogie, explication des donnees de langue et travaux recents sur representations vectorielles de phrases et analogie (2023)
- Authors/Presenters
  Y. Lepage
- Conference/Venue
  Workshop Analogies: From learning to explainability, 27-28 nov. 2023, Arras, France
- Related Report
  2023 Annual Research Report
- Invited
[Presentation] Analogie et moyenne : considerations generales et application aux chaines (Analogy and means: general considerations and applications to strings, in French) (2023)
- Authors/Presenters
  Y. Lepage
- Conference/Venue
  Forum sciences cognitives et traitement automatique des langues, 29 nov. 2023, Nancy, France
- Related Report
  2023 Annual Research Report
- Invited
[Presentation] Jeux d'analogies pour le TAL (Analogy test sets for NLP, in French) (2023)
- Authors/Presenters
  Y. Lepage
- Conference/Venue
  MALOTEC/LORIA seminar, 13 dec. 2023, Nancy, France
- Related Report
  2023 Annual Research Report
- Invited
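[Presentation] Investigating parallelograms inside word embedding space using various analogy test sets in various languages (2023)
- Authors/Presenters
  R. Fam and Y. Lepage
- Conference/Venue
  Proceedings of the 29th Annual Meeting of the Association for Natural Language Processing, Naha, pages 718--722
- Related Report
  2023 Annual Research Report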
[Presentation] Giving a structure to language data: from analogies to analogical grids (2022)
- Authors/Presenters
  Yves Lepage
- Conference/Venue
  Invited talk at the seminar of Dublin City University (DCU), 4th of July 2022.
- Related Report
  2022 Research-status Report
- Invited
[Presentation] Analogy on text data (2022)
- Authors/Presenters
  Yves Lepage
- Conference/Venue
  Invited talk at the workshop Interaction between Analogical Reasoning and Machine Learning (IARML 2022), 23rd of July 2022.
- Related Report
  2022 Research-status Report
- International Conference / Invited
[Remarks] Kakenhi Project 21K12038
- URL
  http://lepage-lab.ips.waseda.ac.jp/projects/Kakenhi_Project_21K12038/
- Related Report
  2023 Annual Research Report
[Remarks] Kakenhi Kiban C 18K11447
- URL
  http://lepage-lab.ips.waseda.ac.jp/en/projects/kakenhi-kiban-c-18k11447/
- Related Report
  2022 Research-status Report
