Building an Error-Annotated Corpus of Learner Indonesian and Developing an Automated Writing Support for Japanese Students Using Deep Linguistic Indonesian Parsers

Research Project

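Project/Area Number	23K12235
Research Category	Grant-in-Aid for Early-Career Scientists
Allocation Type	Multi-year Fund
Review Section	Basic Section 02100: Foreign language education-related
Research Institution	Kanda University of International Studies
Principal Investigator	MOELJADI David  Kanda University of International Studies, Faculty of Foreign Languages, Lecturer (60928290)
Project Period (FY)	2023-04-01 – 2026-03-31
Project Status	Granted (FY2023)
Budget Amount *note	¥2,470,000 (Direct Cost: ¥1,900,000, Indirect Cost: ¥570,000) FY2025: ¥780,000 (Direct Cost: ¥600,000, Indirect Cost: ¥180,000) FY2024: ¥910,000 (Direct Cost: ¥700,000, Indirect Cost: ¥210,000) FY2023: ¥780,000 (Direct Cost: ¥600,000, Indirect Cost: ¥180,000)
Keywords	learner corpus / Indonesian language / language education / error annotation / feedback system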
Outline of Research at the Start	An error-annotated learner corpus is a valuable source of information on the types and frequencies of errors made by foreign language learners. It can also be used to develop a Computer-Assisted Language Learning (CALL) system that provides accurate and immediate feedback. In this research, I focus on the Indonesian writing skills of Japanese university students taking Indonesian language courses, particularly at Kanda University of International Studies, Tokyo University of Foreign Studies, and Ritsumeikan Asia Pacific University.
Outline of Annual Research Achievements	In 2023, I gathered more than 1,200 written assignments (essays) from more than 300 students, all of whom gave their consent. The students are from six universities: Kanda University of International Studies (KUIS), Tokyo University of Foreign Studies (TUFS), Ritsumeikan Asia Pacific University (APU), Sophia University, Chuo University, and Keio University. I created and revised an error tagset, which currently consists of 4 categories (lexical, grammatical, spelling, and other errors) and 48 error tags. For annotation, I use UAM CorpusTool version 3. I employed four Japanese students from KUIS to enter the data from the consent forms and to type up the handwritten assignments. Four Indonesian teachers from KUIS, TUFS, and APU annotated the corpus.
Current Status of Research Progress (Category)	2: Progressing largely as planned. Reason: Initially, I planned to gather essays from 30 students at 3 universities in Japan (KUIS, TUFS, and APU). However, I managed to gather more than 1,200 essays from more than 300 students at 6 universities. Because of the large number of essays gathered, the annotation process is not yet finished. At present, less than one fourth of the essays have been annotated and checked. In addition, I had planned to release the annotated corpus in the first year, but for the reason above, I now plan to release it after all the essays have been annotated and checked.
Strategy for Future Research Activity	During my presentation at a research meeting at TUFS, I received feedback from Malay/Indonesian lecturers and experts. They suggested that I devote all three years to building the learner corpus, rather than building it in one year and spending the following two years developing an automated writing support system for students. Building a learner corpus is time-consuming and labor-intensive. However, it is very important not only for language teaching but also for grammar research and other research purposes. Building a useful, high-quality data source (corpus) is already a major project in itself. Thus, I would like to continue gathering more essays from students and, at the same time, annotating the errors in the essays.

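Reports

(1 result)

2023 Research-status Report

Research Products
(6 results)

All 2023

All  Journal Articles (2 results) (of which internationally co-authored: 1, peer-reviewed: 2, open access: 2)  Conference Presentations (4 results) (of which international conferences: 3)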

[Journal Article] Penyusunan KOPER: Korpus Pemelajar Bahasa Indonesia Beranotasi Eror (2023)
- Author(s)/Presenter(s)
  David Moeljadi
- Journal Title
  Prosiding Kongres Bahasa Indonesia XII
  Volume: 1  Pages: 429-444
- Related Report
  2023 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] NusaX: Multilingual Parallel Sentiment Dataset for 10 Indonesian Local Languages (2023)
- Author(s)/Presenter(s)
  Winata Genta Indra, Aji Alham Fikri, Cahyawijaya Samuel, Mahendra Rahmad, Koto Fajri, Romadhony Ade, Kurniawan Kemal, Moeljadi David, Prasojo Radityo Eko, Fung Pascale, Baldwin Timothy, Lau Jey Han, Sennrich Rico, Ruder Sebastian
- Journal Title
  Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics
  Volume: 1  Pages: 815-834
- DOI
  10.18653/v1/2023.eacl-main.57
- Related Report
  2023 Research-status Report
- Peer Reviewed / Open Access / Internationally Co-authored
[Conference Presentation] Penyusunan Koper: Korpus Pemelajar Bahasa Indonesia Beranotasi Eror (2023)
- Author(s)/Presenter(s)
  David Moeljadi
- Conference Name
  Kongres Bahasa Indonesia XII
- Related Report
  2023 Research-status Report
- International Conference
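[Conference Presentation] エラータグ付きインドネシア語学習者コーパスの構築 [Building an Error-Tagged Corpus of Learner Indonesian] (2023)
- Author(s)/Presenter(s)
  David Moeljadi
- Conference Name
  日本インドネシア学会第54回研究大会 (54th research conference of the Japanese academic society for Indonesian studies)
- Related Report
  2023 Research-status Report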
[Conference Presentation] A study of morphology of onomatopoeias in Indonesian (2023)
- Author(s)/Presenter(s)
  David Moeljadi
- Conference Name
  The 26th International Symposium on Malay/Indonesian Linguistics (ISMIL)
- Related Report
  2023 Research-status Report
- International Conference
[Conference Presentation] Building the Old Javanese Wordnet (2023)
- Author(s)/Presenter(s)
  David Moeljadi
- Conference Name
  International Kawi Culture Festival
- Related Report
  2023 Research-status Report
- International Conference
