
Compositionality and Interpretation of Word Embeddings

Research Project

Project/Area Number 19K12099
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Multi-year Fund
Section General
Review Section Basic Section 61030:Intelligent informatics-related
Research Institution Tokyo Metropolitan University

Principal Investigator

Komachi Mamoru  Tokyo Metropolitan University, Graduate School of Systems Design, Professor (60581329)

Project Period (FY) 2019-04-01 – 2022-03-31
Project Status Completed (Fiscal Year 2021)
Budget Amount
¥4,290,000 (Direct Cost: ¥3,300,000, Indirect Cost: ¥990,000)
Fiscal Year 2021: ¥650,000 (Direct Cost: ¥500,000, Indirect Cost: ¥150,000)
Fiscal Year 2020: ¥650,000 (Direct Cost: ¥500,000, Indirect Cost: ¥150,000)
Fiscal Year 2019: ¥2,990,000 (Direct Cost: ¥2,300,000, Indirect Cost: ¥690,000)
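Keywords distributed word representations / compositionality / machine translation / grammatical error correction / semantic change / deep learning / natural language processing / grammatical error detection / machine learning / distributed representations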
Outline of Research at the Start

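This research studies, from an information-theoretic perspective, how semantic compositionality is realized when learning distributed word representations in natural language processing, and how representations of sentence meaning can be computed. The morpheme is said to be the smallest unit that carries meaning, but it is not clear which constituents are needed to compute the meaning of a sentence. This research therefore aims to explore meaning-bearing elements at units smaller than the morpheme and to establish techniques for computing sentence meaning from them.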

Outline of Final Research Achievements

In this research, we studied methods for composing distributed word representations from smaller units in representation learning for natural language processing. Specifically, focusing on machine translation, we explored the optimal input granularity for learning distributed word representations in Japanese-Chinese translation. We also clarified what kind of knowledge transfers across languages such as Japanese, English, German, and Russian in grammatical error correction. In addition, we addressed the interpretation of word representations and proposed a highly interpretable method, with an information-theoretic background, for learning word representations that capture diachronic semantic change.

Academic Significance and Societal Importance of the Research Achievements

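The results of this research suggest that, for languages written with logographic characters such as Japanese and Chinese, it may be more appropriate to capture meaning at units finer than the character. Worldwide, languages that use a small alphabet, as typified by English, are the most widely studied, but our findings imply that methods proposed for such languages are not necessarily optimal for Japanese or Chinese. With the advent of deep learning, a variety of methods that handle multiple languages simultaneously have been proposed, and our results reaffirm the importance of also taking the characteristics of each language into account.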

Report

(4 results)
  • 2021 Annual Research Report   Final Research Report ( PDF )
  • 2020 Research-status Report
  • 2019 Research-status Report
  • Research Products

    (25 results)

All Int'l Joint Research (2 results) Journal Article (6 results) (of which Peer Reviewed: 6 results,  Open Access: 6 results) Presentation (17 results) (of which Int'l Joint Research: 17 results)

  • [Int'l Joint Research] IT University of Copenhagen/University of Groningen (Denmark)

    • Related Report
      2020 Research-status Report
  • [Int'l Joint Research] University of Liverpool (UK)

    • Related Report
      2019 Research-status Report
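  • [Journal Article] Grammatical Error Correction Using Pre-trained Models and Multilingual Learner Data for Cross-lingual Transfer Learning (2022)

    • Author(s)
      Ikumi Yamashita, Masahiro Kaneko, Masato Mita, Satoru Katsumata, Aizhan Imankulova, Mamoru Komachi
    • Journal Title

      Journal of Natural Language Processing (自然言語処理)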

      Volume: 29

    • Related Report
      2021 Annual Research Report
    • Peer Reviewed / Open Access
  • [Journal Article] Using Sub-character Level Information for Neural Machine Translation of Logographic Languages (2021)

    • Author(s)
      Zhang Longtu and Komachi Mamoru
    • Journal Title

      ACM Transactions on Asian and Low-Resource Language Information Processing

      Volume: 20 Issue: 2 Pages: 1-15

    • DOI

      10.1145/3431727

    • Related Report
      2021 Annual Research Report
    • Peer Reviewed / Open Access
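  • [Journal Article] Optimization for Reference-less Automatic Evaluation of Grammatical Error Correction (2021)

    • Author(s)
      Ryoma Yoshimura, Masahiro Kaneko, Tomoyuki Kajiwara, Mamoru Komachi
    • Journal Title

      Journal of Natural Language Processing (自然言語処理)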

      Volume: 28

    • Related Report
      2020 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] Using Sub-Character Level Information for Neural Machine Translation of Logographic Languages (2021)

    • Author(s)
      Longtu Zhang and Mamoru Komachi
    • Journal Title

      ACM Transactions on Asian and Low-Resource Language Information Processing

      Volume: -

    • Related Report
      2020 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] Multi-Head Multi-Layer Attention to Deep Language Representations for Grammatical Error Detection (2019)

    • Author(s)
      Masahiro Kaneko and Mamoru Komachi
    • Journal Title

      Computación y Sistemas

      Volume: 23 Pages: 883-891

    • Related Report
      2019 Research-status Report
    • Peer Reviewed / Open Access
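  • [Journal Article] Automatic Evaluation of Machine Translation Using Pre-trained Distributed Sentence Representations (2019)

    • Author(s)
      Hiroki Shimanaka, Tomoyuki Kajiwara, Mamoru Komachi
    • Journal Title

      Journal of Natural Language Processing (自然言語処理)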

      Volume: 26 Pages: 613-634

    • NAID

      130007761392

    • Related Report
      2019 Research-status Report
    • Peer Reviewed / Open Access
  • [Presentation] Analyzing Semantic Changes in Japanese Words Using BERT (2021)

    • Author(s)
      Kazuma Kobayashi, Taichi Aida and Mamoru Komachi
    • Organizer
      35th Pacific Asia Conference on Language, Information and Computation (PACLIC 2021)
    • Related Report
      2021 Annual Research Report
    • Int'l Joint Research
  • [Presentation] A Comprehensive Analysis of PMI-based Models for Measuring Semantic Differences (2021)

    • Author(s)
      Taichi Aida, Mamoru Komachi, Toshinobu Ogiso, Hiroya Takamura, Daichi Mochihashi
    • Organizer
      35th Pacific Asia Conference on Language, Information and Computation (PACLIC 2021)
    • Related Report
      2021 Annual Research Report
    • Int'l Joint Research
  • [Presentation] From Masked-Language Modeling to Translation: Non-English Auxiliary Tasks Improve Zero-shot Spoken Language Understanding (2021)

    • Author(s)
      Rob van der Goot (IT University of Copenhagen), Marija Stepanovic (IT University of Copenhagen), Alan Ramponi (IT University of Copenhagen), Ibrahim Sharaf, Ahmet Ustun (University of Groningen), Aizhan Imankulova, Siti Oryza Khairunnisa, Mamoru Komachi and Barbara Plank (IT University of Copenhagen)
    • Organizer
      2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2021)
    • Related Report
      2020 Research-status Report
    • Int'l Joint Research
  • [Presentation] SOME: Reference-less Sub-Metrics Optimized for Manual Evaluations of Grammatical Error Correction (2020)

    • Author(s)
      Ryoma Yoshimura, Masahiro Kaneko, Tomoyuki Kajiwara (Osaka University) and Mamoru Komachi
    • Organizer
      28th International Conference on Computational Linguistics (COLING)
    • Related Report
      2020 Research-status Report
    • Int'l Joint Research
  • [Presentation] Cross-lingual Transfer Learning for Grammatical Error Correction (2020)

    • Author(s)
      Ikumi Yamashita, Satoru Katsumata, Masahiro Kaneko, Aizhan Imankulova and Mamoru Komachi
    • Organizer
      28th International Conference on Computational Linguistics (COLING)
    • Related Report
      2020 Research-status Report
    • Int'l Joint Research
  • [Presentation] Chinese Grammatical Correction Using BERT-based Pre-trained Model (2020)

    • Author(s)
      Hongfei Wang, Michiki Kurosawa, Satoru Katsumata and Mamoru Komachi
    • Organizer
      1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing (AACL-IJCNLP)
    • Related Report
      2020 Research-status Report
    • Int'l Joint Research
  • [Presentation] Stronger Baselines for Grammatical Error Correction Using a Pretrained Encoder-Decoder Model (2020)

    • Author(s)
      Satoru Katsumata and Mamoru Komachi
    • Organizer
      1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing (AACL-IJCNLP)
    • Related Report
      2020 Research-status Report
    • Int'l Joint Research
  • [Presentation] Non-Autoregressive Grammatical Error Correction Towards a Writing Support System (2020)

    • Author(s)
      Hiroki Homma and Mamoru Komachi
    • Organizer
      6th Workshop on Natural Language Processing Techniques for Educational Applications (NLP-TEA)
    • Related Report
      2020 Research-status Report
    • Int'l Joint Research
  • [Presentation] Zero-shot North Korean to English Neural Machine Translation by Character Tokenization and Phoneme Decomposition (2020)

    • Author(s)
      Hwichan Kim, Tosho Hirasawa and Mamoru Komachi
    • Organizer
      58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop (ACL 2020 SRW)
    • Related Report
      2020 Research-status Report
    • Int'l Joint Research
  • [Presentation] Automated Essay Scoring System for Nonnative Japanese Learners (2020)

    • Author(s)
      Reo Hirao, Mio Arai, Hiroki Shimanaka, Satoru Katsumata and Mamoru Komachi
    • Organizer
      12th International Conference on Language Resources and Evaluation (LREC 2020)
    • Related Report
      2020 Research-status Report
    • Int'l Joint Research
  • [Presentation] Korean to Japanese Neural Machine Translation System Using Hanja Information (2020)

    • Author(s)
      Hwichan Kim, Tosho Hirasawa and Mamoru Komachi
    • Organizer
      7th Workshop on Asian Translation (WAT)
    • Related Report
      2020 Research-status Report
    • Int'l Joint Research
  • [Presentation] TMU System Using BERT-based Pre-trained Model to the NLP-TEA CGED Shared Task 2020 (2020)

    • Author(s)
      Hongfei Wang and Mamoru Komachi
    • Organizer
      6th Workshop on Natural Language Processing Techniques for Educational Applications (NLP-TEA)
    • Related Report
      2020 Research-status Report
    • Int'l Joint Research
  • [Presentation] TMUOU submission for WMT20 Quality Estimation Shared Task (2020)

    • Author(s)
      Akifumi Nakamachi (Osaka University), Hiroki Shimanaka, Tomoyuki Kajiwara (Osaka University) and Mamoru Komachi
    • Organizer
      Fifth Conference on Machine Translation (WMT 2020)
    • Related Report
      2020 Research-status Report
    • Int'l Joint Research
  • [Presentation] Automated Essay Scoring System for Nonnative Japanese Learners (2020)

    • Author(s)
      Reo Hirao, Mio Arai, Hiroki Shimanaka, Satoru Katsumata and Mamoru Komachi
    • Organizer
      12th International Conference on Language Resources and Evaluation
    • Related Report
      2019 Research-status Report
    • Int'l Joint Research
  • [Presentation] Zero-shot North Korean to English Neural Machine Translation by Character Tokenization and Phoneme Decomposition (2020)

    • Author(s)
      Hwichan Kim, Tosho Hirasawa and Mamoru Komachi
    • Organizer
      ACL 2020 Student Research Workshop
    • Related Report
      2019 Research-status Report
    • Int'l Joint Research
  • [Presentation] Chinese-Japanese Unsupervised Neural Machine Translation Using Sub-character Level Information (2019)

    • Author(s)
      Longtu Zhang and Mamoru Komachi
    • Organizer
      The 33rd Pacific Asia Conference on Language, Information and Computation
    • Related Report
      2019 Research-status Report
    • Int'l Joint Research
  • [Presentation] Debiasing Word Embeddings Improves Multimodal Machine Translation (2019)

    • Author(s)
      Tosho Hirasawa and Mamoru Komachi
    • Organizer
      17th Machine Translation Summit
    • Related Report
      2019 Research-status Report
    • Int'l Joint Research

Published: 2019-04-18   Modified: 2023-01-30  
