Theoretically founded algorithms for the automatic production of analogy tests in NLP

Research Project

Project/Area Number 21K12038
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Multi-year Fund
Section General
Review Section Basic Section 61030: Intelligent informatics-related
Research Institution Waseda University

Principal Investigator

LEPAGE YVES  Waseda University, Faculty of Science and Engineering (Graduate School of Information, Production and Systems / Center), Professor (70573608)

Project Period (FY) 2021-04-01 – 2024-03-31
Project Status Completed (Fiscal Year 2023)
Budget Amount
¥4,030,000 (Direct Cost: ¥3,100,000, Indirect Cost: ¥930,000)
Fiscal Year 2023: ¥1,040,000 (Direct Cost: ¥800,000, Indirect Cost: ¥240,000)
Fiscal Year 2022: ¥1,820,000 (Direct Cost: ¥1,400,000, Indirect Cost: ¥420,000)
Fiscal Year 2021: ¥1,170,000 (Direct Cost: ¥900,000, Indirect Cost: ¥270,000)
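Keywords cognitive ability / analogical relations / exhaustive extraction of analogical relations / word embedding space / neural network models for analogies between sentences / analogies between real values / analogies between Boolean values / analogies between integer values / natural language processing / word embedding representations / inference / embedding representations / analogy datasets / algorithms / deep learning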
Outline of Research at the Start

The most important recent breakthrough in Natural Language Processing (NLP) is the vector representation of words or parts of sentences. To assess the quality of vector representations of words, analogy test sets are used (France : Paris :: Japan : x => x = Tokyo).
Up to now, the production of such data sets has not been automatic. This research will study, explore and release theoretically well-founded methods to automatically extract analogy test sets, not only between words but also between parts of sentences and, ideally, for any language.
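
An analogy test such as France : Paris :: Japan : x is commonly solved in an embedding space by vector offset: the answer is the word whose vector is closest to Paris - France + Japan. The following is a minimal sketch of that idea and not the project's own method. The five-dimensional vectors in `TOY_EMBEDDINGS` and the function names are hypothetical, hand-built so that the arithmetic works out exactly; real test sets would use trained embeddings.

```python
import math

# Hypothetical toy embeddings (assumption, not real data).
# Dimensions: [is-country, is-capital, France, Japan, Germany]
TOY_EMBEDDINGS = {
    "France":  [1.0, 0.0, 1.0, 0.0, 0.0],
    "Paris":   [0.0, 1.0, 1.0, 0.0, 0.0],
    "Japan":   [1.0, 0.0, 0.0, 1.0, 0.0],
    "Tokyo":   [0.0, 1.0, 0.0, 1.0, 0.0],
    "Germany": [1.0, 0.0, 0.0, 0.0, 1.0],
    "Berlin":  [0.0, 1.0, 0.0, 0.0, 1.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def solve_analogy(embeddings, a, b, c):
    """Solve a : b :: c : x by the vector-offset heuristic.

    The target vector is b - a + c; the answer is the vocabulary word
    (other than a, b, c) with the highest cosine similarity to it.
    """
    target = [vb - va + vc
              for va, vb, vc in zip(embeddings[a], embeddings[b], embeddings[c])]
    candidates = (w for w in embeddings if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(embeddings[w], target))


if __name__ == "__main__":
    print(solve_analogy(TOY_EMBEDDINGS, "France", "Paris", "Japan"))  # Tokyo
```

Excluding the three query words from the candidates is a common convention in such tests, since the offset vector often lies closest to one of them.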

Outline of Final Research Achievements

Recent artificial intelligence uses numbers to represent the meaning of words or sentences. To evaluate whether the meaning is correctly represented, analogy datasets are used. However, the construction of analogy datasets has not been automated until now, and those constructed manually in English are biased toward English, even when translated into Japanese, and biased toward particular types of analogical relations.
By automatically constructing multilingual analogy datasets, we showed that they are useful for the analysis and generation of regular and irregular word forms, and we discovered new semantic analogical relations between words. From the construction of sentence analogy datasets, we learned which sentence patterns contain more analogical relations. We proposed a paraphrase-based method for constructing sentence analogy datasets, and also proposed neural network models for understanding and solving analogical relations.

Academic Significance and Societal Importance of the Research Achievements

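One characteristic human cognitive behaviour is the recognition of analogical relations. For example, the question "man" : "woman" :: "king" : ? can be answered with "queen". Likewise, "I like this song." : "I feel like singing." :: "I like this game." : "I feel like playing." is an example between sentences.
To measure how far the word and sentence representations of state-of-the-art artificial intelligence possess this cognitive ability, analogy datasets are needed. This research studied the construction of analogy datasets between words and between sentences. We proposed and examined methods that work for multiple languages and not only English, and that cover a broader range of relations than the classical ones (gender, country and capital).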

Report

(4 results)
  • 2023 Annual Research Report   Final Research Report ( PDF )
  • 2022 Research-status Report
  • 2021 Research-status Report
  • Research Products

    (26 results)

All Journal Article (7 results) (of which Int'l Joint Research: 6 results,  Peer Reviewed: 6 results,  Open Access: 3 results) Presentation (17 results) (of which Int'l Joint Research: 11 results,  Invited: 6 results) Remarks (2 results)

  • [Journal Article] A study of universal morphological analysis using morpheme-based, holistic, and neural approaches under various data size conditions (2024)

    • Author(s)
      R. Fam and Y. Lepage
    • Journal Title

      Annals of Mathematics and Artificial Intelligence

      Volume: To appear

    • Related Report
      2023 Annual Research Report
    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] Learning from masked analogies between sentences at multiple levels of formality (2023)

    • Author(s)
      Wang Liyan、Lepage Yves
    • Journal Title

      Annals of Mathematics and Artificial Intelligence

      Volume: 93 Issue: 2 Pages: 237-261

    • DOI

      10.1007/s10472-023-09918-2

    • Related Report
      2023 Annual Research Report
    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] A study in the generation of multilingually aligned middle sentences (2023)

    • Author(s)
      M. Eget, X. Yang, and Y. Lepage
    • Journal Title

      Proceedings of the 10th Language & Technology Conference (LTC 2023) & Human Language Technologies as a Challenge for Computer Science and Linguistics

      Volume: 0 Pages: 45-49

    • Related Report
      2022 Research-status Report
    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] Investigating parallelograms: Assessing several word embedding spaces against various analogy test sets in several languages using approximation (2023)

    • Author(s)
      R. Fam and Y. Lepage
    • Journal Title

      Proceedings of the 10th Language & Technology Conference (LTC 2023) & Human Language Technologies as a Challenge for Computer Science and Linguistics

      Volume: 0 Pages: 68-72

    • Related Report
      2022 Research-status Report
    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] Solving sentence analogies by using embedding spaces combined with a vector-to-sequence decoder or by fine-tuning pre-trained language models (2023)

    • Author(s)
      L. Wang, Z. Pang, H. Wang, X. Zhao, and Y. Lepage
    • Journal Title

      Proceedings of the 10th Language & Technology Conference (LTC 2023) & Human Language Technologies as a Challenge for Computer Science and Linguistics

      Volume: 0 Pages: 325-330

    • Related Report
      2022 Research-status Report
    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] Organising lexica into analogical grids: a study of a holistic approach for morphological generation under various sizes of data in various languages (2022)

    • Author(s)
      Fam Rashel、Lepage Yves
    • Journal Title

      Journal of Experimental & Theoretical Artificial Intelligence

      Volume: 0 Pages: 1-26

    • Related Report
      2022 Research-status Report
  • [Journal Article] A Study of Analogical Density in Various Corpora at Various Granularity (2021)

    • Author(s)
      Fam Rashel、Lepage Yves
    • Journal Title

      Information

      Volume: 12 Issue: 8 Pages: 314-314

    • DOI

      10.3390/info12080314

    • Related Report
      2021 Research-status Report
    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Presentation] Analogie et moyenne generalisee (2024)

    • Author(s)
      Y. Lepage and M. Couceiro
    • Organizer
      In Actes de la conference Journees d'intelligence artificielle francaises -- Plateforme francaise d'intelligence artificielle (PFIA-JIAF 2024) (Accepted, to appear)
    • Related Report
      2023 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Continued pre-training on sentence analogies for translation with small data (2024)

    • Author(s)
      L. Wang, H. Wang, and Y. Lepage
    • Organizer
      LREC-COLING 2024 (to appear)
    • Related Report
      2023 Annual Research Report
    • Int'l Joint Research
  • [Presentation] A study in the generation of multilingually aligned middle sentences (2023)

    • Author(s)
      M. Eget, X. Yang, and Y. Lepage
    • Organizer
      Proceedings of the 10th Language & Technology Conference (LTC 2023) -- Human Language Technologies as a Challenge for Computer Science and Linguistics, pages 45--49, April 2023.
    • Related Report
      2023 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Investigating parallelograms: Assessing several word embedding spaces against various analogy test sets in several languages using approximation (2023)

    • Author(s)
      R. Fam and Y. Lepage
    • Organizer
      Proceedings of the 10th Language & Technology Conference (LTC 2023) -- Human Language Technologies as a Challenge for Computer Science and Linguistics, pages 68--72, April 2023.
    • Related Report
      2023 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Solving sentence analogies by using embedding spaces combined with a vector-to-sequence decoder or by fine-tuning pre-trained language models (2023)

    • Author(s)
      L. Wang, Z. Pang, H. Wang, X. Zhao, and Y. Lepage
    • Organizer
      Proceedings of the 10th Language & Technology Conference (LTC 2023) -- Human Language Technologies as a Challenge for Computer Science and Linguistics, pages 325--330, April 2023.
    • Related Report
      2023 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Resolution of analogies between strings in the case of multiple solutions (2023)

    • Author(s)
      X. Deng and Y. Lepage
    • Organizer
      In CEUR, editor, Proceedings of ICCBR: Workshop on Analogies: from Theory to Applications (ATA@ICCBR 2023), CEUR Workshop Proceedings, pages 3-14, July 2023
    • Related Report
      2023 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Embedding-to-embedding method based on autoencoder for solving sentence analogies (2023)

    • Author(s)
      W. Mao and Y. Lepage
    • Organizer
      In CEUR, editor, Proceedings of ICCBR: Workshop on Analogies: from Theory to Applications (ATA@ICCBR 2023), CEUR Workshop Proceedings, pages 15-26, July 2023.
    • Related Report
      2023 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Improving sentence embedding with sentence relationships from word analogies (2023)

    • Author(s)
      Q. Zhang and Y. Lepage
    • Organizer
      In CEUR, editor, Proceedings of ICCBR: Workshop on Analogies: from Theory to Applications (ATA@ICCBR 2023), CEUR Workshop Proceedings, pages 43-53, July 2023.
    • Related Report
      2023 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Formulae for the solution of an analogical equation between Booleans using the Sheffer stroke (NAND) or the Pierce arrow (NOR) (2023)

    • Author(s)
      Y. Lepage
    • Organizer
      Proceedings of the Workshop Interactions between analogies and machine learning, colocated with IJCAI 2023 (IARML@IJCAI 2023), pages 3-14, August 2023.
    • Related Report
      2023 Annual Research Report
    • Int'l Joint Research
  • [Presentation] A framework for neural machine translation by fuzzy analogies (2023)

    • Author(s)
      L. Wang, B. Wloka, and Y. Lepage
    • Organizer
      Proceedings of the Workshop Interactions between analogies and machine learning, colocated with IJCAI 2023 (IARML@IJCAI 2023), pages 47-55, August 2023
    • Related Report
      2023 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Analogie et donnees de langue (Analogy and language data, in French) (2023)

    • Author(s)
      Y. Lepage
    • Organizer
      Colloquium LORIA, 15 nov. 2023, LORIA, Nancy, France
    • Related Report
      2023 Annual Research Report
    • Invited
  • [Presentation] Analogie, explication des donnees de langue et travaux recents sur representations vectorielles de phrases et analogie (2023)

    • Author(s)
      Y. Lepage
    • Organizer
      Workshop Analogies: From learning to explainability, 27-28 nov. 2023, Arras, France
    • Related Report
      2023 Annual Research Report
    • Invited
  • [Presentation] Analogie et moyenne : considerations generales et application aux chaines (Analogy and means: general considerations and applications to strings, in French) (2023)

    • Author(s)
      Y. Lepage
    • Organizer
      Forum sciences cognitives et traitement automatique des langues, 29 nov. 2023, Nancy, France
    • Related Report
      2023 Annual Research Report
    • Invited
  • [Presentation] Jeux d'analogies pour le TAL (Analogy test sets for NLP, in French) (2023)

    • Author(s)
      Y. Lepage
    • Organizer
      MALOTEC/LORIA seminar, 13 dec. 2023, Nancy, France
    • Related Report
      2023 Annual Research Report
    • Invited
  • [Presentation] Investigating parallelograms inside word embedding space using various analogy test sets in various languages (2023)

    • Author(s)
      R. Fam and Y. Lepage
    • Organizer
      Proceedings of the 29th Annual Meeting of the Association for Natural Language Processing, Naha, pages 718--722
    • Related Report
      2023 Annual Research Report
  • [Presentation] Giving a structure to language data: from analogies to analogical grids (2022)

    • Author(s)
      Yves Lepage
    • Organizer
      Invited talk at the seminar of Dublin City University (DCU), 4th of July 2022.
    • Related Report
      2022 Research-status Report
    • Invited
  • [Presentation] Analogy on text data (2022)

    • Author(s)
      Yves Lepage
    • Organizer
      Invited talk at the workshop Interaction between Analogical Reasoning and Machine Learning (IARML 2022), 23rd of July 2022.
    • Related Report
      2022 Research-status Report
    • Int'l Joint Research / Invited
  • [Remarks] Kakenhi Project 21K12038

    • URL

      http://lepage-lab.ips.waseda.ac.jp/projects/Kakenhi_Project_21K12038/

    • Related Report
      2023 Annual Research Report
  • [Remarks] Kakenhi Kiban C 18K11447

    • URL

      http://lepage-lab.ips.waseda.ac.jp/en/projects/kakenhi-kiban-c-18k11447/

    • Related Report
      2022 Research-status Report

Published: 2021-04-28   Modified: 2025-01-30  