Acquisition and Analysis of Distributed Representations of Words and Sentences Focusing on Language Learners' Errors

Research Project

Project/Area Number	19KK0286
Research Category	Fund for the Promotion of Joint International Research (Fostering Joint International Research (A))
Allocation Type	Multi-year Fund
Review Section	Basic Section 61030: Intelligent informatics-related
Research Institution	Hitotsubashi University (2023), Tokyo Metropolitan University (2019-2022)
Principal Investigator	KOMACHI Mamoru, Hitotsubashi University, Graduate School of Social Data Science, Professor (60581329)
Project Period (FY)	2020 – 2023
Project Status	Completed (Fiscal Year 2023)
Budget Amount	¥12,610,000 (Direct Cost: ¥9,700,000, Indirect Cost: ¥2,910,000)
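Keywords	natural language processing / deep learning / grammatical error correction / language learning support / language education support / language learning / pre-trained models / pseudo errors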
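Outline of Research at the Start	This international joint research project focuses on the "errors" found in language learners' writing and analyzes what kinds of words and sentences learners produce. In addition to text actually written by language learners, we generate pseudo-errors from large-scale data. The aim is not only to analyze a wide variety of error types but also to enable cross-lingual analysis by comparing multiple languages.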
Outline of Final Research Achievements	We constructed datasets for grammatical error correction in English, Japanese, and Chinese, and analyzed and comprehensively evaluated the outputs of multilingual grammatical error correction systems based on deep learning. The research achievements over the project period are: (1) application of pre-trained models to grammatical error correction; (2) acceleration of grammatical error correction systems; (3) a grammatical error correction method using a pseudo-learner corpus that takes learners' errors into account; (4) analysis and improvement of the diversity of grammatical error correction outputs; (5) transfer learning of linguistic knowledge for grammatical error correction using multilingual models; and (6) development of an automatic evaluation method for grammatical error correction, together with construction of the associated datasets.
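Academic Significance and Societal Importance of the Research Achievements	This research clarified both what state-of-the-art deep-learning-based grammatical error correction methods have achieved and where their limits lie. While grammatical error correction performance improved dramatically, it became clear that the evaluation datasets used so far are not suited to evaluating grammatical error correction methods of the deep learning era. This points to the need to construct multilingual evaluation datasets and to develop appropriate evaluation metrics based on them, and it reaffirms the importance of evaluating language learners' errors.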

Report

(5 results)

2023 Annual Research Report, Final Research Report (PDF)
2022 Research-status Report
2021 Research-status Report
2020 Research-status Report

Research Products
(27 results)

All Int'l Joint Research (1 results) Journal Article (9 results) (of which Peer Reviewed: 9 results, Open Access: 9 results) Presentation (17 results) (of which Int'l Joint Research: 9 results)

[Int'l Joint Research] University of Cambridge (UK) (2023)
- Year and Date
  2023-06-27
- Related Report
  2023 Annual Research Report
[Journal Article] Revisiting Meta-evaluation for Grammatical Error Correction (2024)
- Author(s)
  Masamune Kobayashi, Masato Mita, Mamoru Komachi
- Journal Title
  
  Transactions of the Association for Computational Linguistics
  
  Volume: －
- Related Report
  2023 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Construction of an Error-Tagged Evaluation Corpus for Japanese Grammatical Error Correction (2023)
- Author(s)
  Koyama Aomi, Kiyuna Tomoshige, Kobayashi Kenji, Arai Mio, Mita Masato, Oka Teruaki, Komachi Mamoru
- Journal Title
  
  Journal of Natural Language Processing
  
  Volume: 30 Issue: 2 Pages: 330-371
- DOI
  10.5715/jnlp.30.330
- ISSN
  1340-7619, 2185-8314
- Related Report
  2023 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Construction of an Error-Tagged Evaluation Corpus for Japanese Grammatical Error Correction (2023)
- Author(s)
  小山碧海, 喜友名朝視顕, 小林賢治, 新井美桜, 三田雅人, 岡照晃, 小町守
- Journal Title
  
  Journal of Natural Language Processing
  
  Volume: 30
- Related Report
  2022 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] Japanese Writing Support System with Fast Grammatical Error Correction (2022)
- Author(s)
  本間広樹, 小町守
- Journal Title
  
  Transactions of the Japanese Society for Artificial Intelligence
  
  Volume: 37 Issue: 1 Pages: B-L22_1-14
- DOI
  10.1527/tjsai.37-1_B-L22
- NAID
  130008139189
- ISSN
  1346-0714, 1346-8030
- Year and Date
  2022-01-01
- Related Report
  2021 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] Chinese Grammatical Error Correction Using Pre-trained Models and Pseudo Data (2022)
- Author(s)
  Hongfei Wang, Michiki Kurosawa, Satoru Katsumata, Masato Mita and Mamoru Komachi
- Journal Title
  
  ACM Transactions on Asian and Low-Resource Language Information Processing
  
  Volume: 22 Issue: 3 Pages: 1-12
- DOI
  10.1145/3570209
- Related Report
  2022 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] Grammatical Error Correction with Pre-trained Model and Multilingual Learner Corpus for Cross-lingual Transfer Learning (2022)
- Author(s)
  山下郁海, 金子正弘, 三田雅人, 勝又智, Imankulova Aizhan, 小町守
- Journal Title
  
  Journal of Natural Language Processing
  
  Volume: 29 Issue: 2 Pages: 314-343
- DOI
  10.5715/jnlp.29.314
- ISSN
  1340-7619, 2185-8314
- Related Report
  2022 Research-status Report 2021 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] Optimization of Reference-less Evaluation Metric of Grammatical Error Correction for Manual Evaluations (2021)
- Author(s)
  吉村綾馬, 金子正弘, 梶原智之, 小町守
- Journal Title
  
  Journal of Natural Language Processing
  
  Volume: 28 Issue: 2 Pages: 404-427
- DOI
  10.5715/jnlp.28.404
- NAID
  130008052579
- ISSN
  1340-7619, 2185-8314
- Related Report
  2021 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] Generation of Diverse Corrected Sentences Considering the Degree of Correction (2021)
- Author(s)
  甫立健悟, 金子正弘, 勝又智, 小町守
- Journal Title
  
  Journal of Natural Language Processing
  
  Volume: 28 Issue: 2 Pages: 428-449
- DOI
  10.5715/jnlp.28.428
- NAID
  130008052577
- ISSN
  1340-7619, 2185-8314
- Related Report
  2021 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] Generation of Diverse Corrected Sentences Considering the Degree of Correction in Grammatical Error Correction (2021)
- Author(s)
  甫立健悟, 金子正弘, 勝又智, 小町守
- Journal Title
  
  Journal of Natural Language Processing
  
  Volume: 28
- NAID
  130008052577
- Related Report
  2020 Research-status Report
- Peer Reviewed / Open Access
[Presentation] Large Language Models Are State-of-the-Art Evaluator for Grammatical Error Correction (2024)
- Author(s)
  Masamune Kobayashi, Masato Mita, Mamoru Komachi
- Organizer
  19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024)
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
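[Presentation] Comprehensive Meta-evaluation of Grammatical Error Correction: Limitations of Existing Automatic Evaluation and the Potential of Large Language Models (2024)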
- Author(s)
  小林正宗, 三田雅人, 小町守
- Organizer
  30th Annual Meeting of the Association for Natural Language Processing
- Related Report
  2023 Annual Research Report
[Presentation] Revisiting Meta-evaluation in Grammatical Error Correction (2023)
- Author(s)
  小林正宗, 三田雅人, 小町守
- Organizer
  IPSJ 258th Natural Language Processing / 149th Spoken Language Processing Joint Research Meeting
- Related Report
  2023 Annual Research Report
[Presentation] Construction of a Quality Estimation Dataset for Automatic Evaluation of Japanese Grammatical Error Correction (2022)
- Author(s)
  Daisuke Suzuki, Yujin Takahashi, Ikumi Yamashita, Taichi Aida, Tosho Hirasawa, Michitaka Nakatsuji, Masato Mita, Mamoru Komachi
- Organizer
  13th Edition of Language Resources and Evaluation Conference (LREC 2022)
- Related Report
  2022 Research-status Report 2021 Research-status Report
- Int'l Joint Research
[Presentation] ProQE: Proficiency-wise Quality Estimation Dataset for Grammatical Error Correction (2022)
- Author(s)
  Yujin Takahashi, Masahiro Kaneko, Masato Mita, Mamoru Komachi
- Organizer
  13th Edition of Language Resources and Evaluation Conference (LREC 2022)
- Related Report
  2022 Research-status Report 2021 Research-status Report
- Int'l Joint Research
[Presentation] TMU Feedback Comment Generation System Using Pretrained Sequence-to-Sequence Language Models (2022)
- Author(s)
  Naoya Ueda and Mamoru Komachi
- Organizer
  GenChal 2022: Feedback Comment Generation for Writing Learning
- Related Report
  2022 Research-status Report
- Int'l Joint Research
[Presentation] Error Tagging of a Japanese Grammatical Error Correction Corpus (2022)
- Author(s)
  小山碧海, 喜友名朝視顕, 三田雅人, 岡照晃, 小町守
- Organizer
  IPSJ SIG Technical Reports: Natural Language Processing
- Related Report
  2022 Research-status Report
[Presentation] Oracle Analysis Toward Improving Reranking in Neural Grammatical Error Correction Systems (2022)
- Author(s)
  小林正宗, 高橋悠進, 三田雅人, 小町守
- Organizer
  28th Annual Meeting of the Association for Natural Language Processing
- Related Report
  2021 Research-status Report
[Presentation] Data Creation Toward Fluency Evaluation of Japanese Grammatical Error Correction (2022)
- Author(s)
  木山朔, 上坂奏人, 佐藤郁子, 佐藤京也, 米田悠人, 小山碧海, 三田雅人, 岡照晃, 小町守
- Organizer
  28th Annual Meeting of the Association for Natural Language Processing
- Related Report
  2021 Research-status Report
[Presentation] Comparison of Grammatical Error Correction Using Back-Translation Models (2021)
- Author(s)
  Aomi Koyama, Kengo Hotate, Masahiro Kaneko and Mamoru Komachi
- Organizer
  2021 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop
- Related Report
  2021 Research-status Report
- Int'l Joint Research
[Presentation] Analysis of Grammatical Error Correction Models Using Pseudo-Error Generation on Learner Data (2021)
- Author(s)
  小山碧海，金子正弘，小町守
- Organizer
  YANS 2021
- Related Report
  2021 Research-status Report
[Presentation] Performance Evaluation on Unseen Error Patterns of Grammatical Error Detection Models Trained with Pseudo-Data Augmentation (2021)
- Author(s)
  上田直生也，山下郁海，高橋悠進，平澤寅庄，小町守
- Organizer
  YANS 2021
- Related Report
  2021 Research-status Report
[Presentation] An Investigation and Analysis of Grammatical Error Correction by GPT Using Prompting (2021)
- Author(s)
  中辻充恭，山下郁海，高橋悠進，平澤寅庄，小町守
- Organizer
  YANS 2021
- Related Report
  2021 Research-status Report
[Presentation] Comparison of Grammatical Error Correction Using Back-Translation Models (2021)
- Author(s)
  Aomi Koyama, Kengo Hotate, Masahiro Kaneko and Mamoru Komachi
- Organizer
  NAACL Student Research Workshop (SRW) 2021
- Related Report
  2020 Research-status Report
- Int'l Joint Research
[Presentation] Generating Diverse Corrections with Local Beam Search for Grammatical Error Correction (2020)
- Author(s)
  Kengo Hotate, Masahiro Kaneko and Mamoru Komachi
- Organizer
  28th International Conference on Computational Linguistics (COLING)
- Related Report
  2020 Research-status Report
- Int'l Joint Research
[Presentation] Grammatical Error Correction Using Pseudo Learner Corpus Considering Learner's Error Tendency (2020)
- Author(s)
  Yujin Takahashi, Satoru Katsumata and Mamoru Komachi
- Organizer
  58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop (ACL 2020 SRW)
- Related Report
  2020 Research-status Report
- Int'l Joint Research
[Presentation] Construction of an Evaluation Corpus for Grammatical Error Correction for Learners of Japanese as a Second Language (2020)
- Author(s)
  Aomi Koyama, Tomoshige Kiyuna, Kenji Kobayashi, Mio Arai and Mamoru Komachi
- Organizer
  12th International Conference on Language Resources and Evaluation (LREC 2020)
- Related Report
  2020 Research-status Report
- Int'l Joint Research
