Creating an online accessible database of high-frequency phrases including collocations and chunks with their CEFR levels

Research Project

Project/Area Number	23K21949
Project/Area Number (Other)	22H00677 (2022-2023)
Research Category	Grant-in-Aid for Scientific Research (B)
Allocation Type	Multi-year Fund (2024) Single-year Grants (2022-2023)
Section	General
Review Section	Basic Section 02100: Foreign language education-related
Research Institution	Kyushu University
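Principal Investigator	内田諭 (Satoru Uchida), Kyushu University, Faculty of Languages and Cultures, Associate Professor (20589254)
Co-Investigator(Kenkyū-buntansha)	荒瀬由紀 (Yuki Arase), Tokyo Institute of Technology, School of Computing, Professor (00747165); 工藤洋路, Tamagawa University, College of Humanities, Professor (60509173); 石井康毅 (Yasutake Ishii), Seijo University, Faculty of Social Innovation, Professor (70530103); Christopher G. Haswell, Kyushu University, Faculty of Languages and Cultures, Associate Professor (90536088); Danny Minn, The University of Kitakyushu, Center for Fundamental Education, Associate Professor (60382412)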
Project Period (FY)	2022-04-01 – 2027-03-31
Project Status	Granted (Fiscal Year 2024)
Budget Amount	¥16,900,000 (Direct Cost: ¥13,000,000, Indirect Cost: ¥3,900,000); Fiscal Year 2026: ¥2,860,000 (Direct Cost: ¥2,200,000, Indirect Cost: ¥660,000); Fiscal Year 2025: ¥3,770,000 (Direct Cost: ¥2,900,000, Indirect Cost: ¥870,000); Fiscal Year 2024: ¥4,030,000 (Direct Cost: ¥3,100,000, Indirect Cost: ¥930,000); Fiscal Year 2023: ¥3,640,000 (Direct Cost: ¥2,800,000, Indirect Cost: ¥840,000); Fiscal Year 2022: ¥2,600,000 (Direct Cost: ¥2,000,000, Indirect Cost: ¥600,000)
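Keywords	collocation / phrase / chunk / CEFR / productive vocabulary
Outline of Research at the Start	This project aims to create and publish a CEFR-levelled list of high-frequency phrases, centred on collocations and chunks. Awareness of collocations is low in Japanese classrooms, and they are not taught sufficiently. Moreover, no levelled list suited to Japanese learners of English exists. To address these problems, the project uses large-scale corpora, generative AI, and other resources to build a levelled list of collocations and chunks suited to Japanese learners of English, with the goal of releasing an educational resource that strengthens the productive vocabulary needed for effective writing and speaking instruction.
Outline of Annual Research Achievements	This project aims to create and publish a levelled list of collocations and chunks tailored to Japanese learners of English. The difficulty of a collocation or chunk is not determined solely by the surface difficulty of its component words. In particular, difficulty can stem from mismatches between English and the learner's first language (Japanese), so a list that takes first-language influence into account is essential for improving English education in Japan broadly. Listing the formulaic expressions (chunks) that contain a target word is also considered useful for learners. In line with these aims, the following work was carried out in FY2023. [(1) Experiments on extracting collocations and chunks with large language models (LLMs)] We adjusted prompts in various ways and tried methods of extracting collocations and chunks from LLMs. The results showed that they can be extracted effectively through the ChatGPT API. To find still more effective methods, this work will continue in the following fiscal years. [(2) Validation of the LLM-based lists] We used a large-scale corpus (COCA) to validate the lists of collocations and chunks produced from LLMs. The validation showed that the generated high-frequency lists agreed with the corpus data at about 50% on average. Items that did not match were also, in many cases, high-frequency items and therefore of high pedagogical value. These results were submitted to and published in the international journal Applied Corpus Linguistics (Uchida, 2024). [(3) Symposium] On 18 to 19 November 2023 we held a symposium in Beppu, Oita Prefecture, jointly with the Methodology SIG (メソドロジー研究会). There were more than ten presentations related to the project's theme, and we were able to exchange the latest information with researchers in related areas. We hope to hold similar meetings in the future.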
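The list comparison reported under (2) can be illustrated with a minimal sketch. It assumes the agreement rate is the share of LLM-generated items that also appear in a corpus-derived list; the report does not state the exact metric, so this definition, the function name `overlap_rate`, and the toy phrase lists are all hypothetical and are not the project's data or code.

```python
def overlap_rate(llm_items, corpus_items):
    """Return the share of LLM-generated items also found in the corpus list.

    Assumption: items are compared after trimming whitespace and lowercasing.
    The project's actual matching procedure is not described in the report.
    """
    llm_set = {item.strip().lower() for item in llm_items}
    corpus_set = {item.strip().lower() for item in corpus_items}
    if not llm_set:
        return 0.0  # avoid division by zero for an empty LLM list
    return len(llm_set & corpus_set) / len(llm_set)


# Hypothetical toy data for illustration only (not taken from COCA or ChatGPT).
llm_collocations = ["make a decision", "make a mistake", "make progress", "make an effort"]
corpus_collocations = ["make a decision", "make sense", "make a mistake", "make money"]

rate = overlap_rate(llm_collocations, corpus_collocations)
print(f"Agreement with corpus list: {rate:.0%}")
```

A set-based comparison ignores rank order; a rank-sensitive measure would be needed if the position of an item in the frequency list matters.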
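Current Status of Research Progress	2: Research has progressed on the whole more than it was originally planned. Reason: As reported in the Outline of Annual Research Achievements, the research is proceeding according to the original plan. Generative AI has greatly changed the research landscape, but, as summarised in Uchida (2024), we have published a report on its relationship to corpora that is advanced even by international standards. The symposium also strengthened ties among researchers, and the research framework for the coming years can be said to be in place. We therefore judge the project to be "progressing largely as planned" (おおむね順調に進展している).
Strategy for Future Research Activity	Going forward, with a view to further use of generative AI, which has proved effective, we plan to gather relevant information and pursue applied research. We also intend to hold symposia and similar events to expand the network of researchers. At this point we do not think any major change to the research plan is necessary.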

Report

(2 results)

2023 Annual Research Report
2022 Annual Research Report

Research Products
(39 results)

Journal Article (7 results) (of which Peer Reviewed: 5 results, Open Access: 6 results); Presentation (29 results) (of which Int'l Joint Research: 6 results, Invited: 3 results); Book (3 results)

[Journal Article] Evaluating the Accuracy of ChatGPT in Assessing Writing and Speaking: A Verification Study Using ICNALE GRA (2024)
- Author(s)
  Satoru Uchida
- Journal Title
  Learner Corpus Studies in Asia and the World
  Volume: 6 Pages: 1-12
- DOI
  10.24546/0100487710
- ISSN
  2187-6746
- Year and Date
  2024-03-20
- Related Report
  2023 Annual Research Report
- Open Access
[Journal Article] Using early LLMs for corpus linguistics: Examining ChatGPT’s potential and limitations (2024)
- Author(s)
  Satoru Uchida
- Journal Title
  Applied Corpus Linguistics
  Volume: 4(1) Issue: 1 Pages: 100089-100089
- DOI
  10.1016/j.acorp.2024.100089
- Related Report
  2023 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Profiling English sentences based on CEFR levels (2024)
- Author(s)
  Satoru Uchida, Yuki Arase, Tomoyuki Kajiwara
- Journal Title
  ITL-International Journal of Applied Linguistics
  Volume: - Issue: 1 Pages: 103-126
- DOI
  10.1075/itl.22018.uch
- Related Report
  2023 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Research on the Relationship between Difficulty Levels of Collocations and Their Constituent Words (2023)
- Author(s)
  石井康毅
- Journal Title
  Journal of Corpus-based Lexicology Studies
  Volume: 5 Pages: 39-53
- DOI
  10.24546/0100479384
- ISSN
  2434-169X
- Year and Date
  2023-03-10
- Related Report
  2022 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Debating researcher labels in the field of language learning psychology: Do we really have an identity crisis? (2023)
- Author(s)
  Christopher G. Haswell, Jonathan Shachter
- Journal Title
  言語文化論究
  Volume: 50 Pages: 53-63
- Related Report
  2023 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Producing English as a Lingua Franca online content (2023)
- Author(s)
  Christopher G. Haswell, Aaron Hahn
- Journal Title
  言語文化論究
  Volume: 51 Pages: 45-52
- Related Report
  2023 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Integrating ChatGPT into the EFL Classroom: Benefits and Challenges (2023)
- Author(s)
  Murphy, R., Wotley, D., and Minn, D.
- Journal Title
  北九州市立大学 基盤教育センター紀要
  Volume: 40 Pages: 97-166
- Related Report
  2022 Annual Research Report
[Presentation] Quantitative Insights into ICNALE: Examining Language Feature Indicators and Evaluating with ChatGPT (2024)
- Author(s)
  Satoru Uchida
- Organizer
  Learner Corpus Studies in Asia and the World LCSAW6
- Related Report
  2023 Annual Research Report
- Invited
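[Presentation] A CEFR-J level assessment tool for English writing: How CWLA works and examples of its use (2024)
- Author(s)
  内田諭
- Organizer
  FY2020-2024 Grant-in-Aid for Scientific Research (A) "A Comprehensive Study of CAN-DO Task-Centred Teaching and Assessment Based on the CEFR-J" (PI: 根岸雅史), CEFR-J 2024 Symposium Workshop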
- Related Report
  2023 Annual Research Report
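[Presentation] Assessing the difficulty of English texts from a grammatical perspective using the CEFR-J Grammar Profile (2024)
- Author(s)
  石井康毅
- Organizer
  FY2020-2024 Grant-in-Aid for Scientific Research (A) "A Comprehensive Study of CAN-DO Task-Centred Teaching and Assessment Based on the CEFR-J" (PI: 根岸雅史), CEFR-J 2024 Symposium Workshop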
- Related Report
  2023 Annual Research Report
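[Presentation] Assigning CEFR levels to English functional expressions for Japanese speakers (2024)
- Author(s)
  石井康毅
- Organizer
  FY2020-2024 Grant-in-Aid for Scientific Research (A) "A Comprehensive Study of CAN-DO Task-Centred Teaching and Assessment Based on the CEFR-J" (PI: 根岸雅史), CEFR-J 2024 Symposium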
- Related Report
  2023 Annual Research Report
[Presentation] The usefulness and limitations of generative AI as a corpus (2023)
- Author(s)
  内田諭
- Organizer
  49th Conference of the Japan Association for English Corpus Studies
- Related Report
  2023 Annual Research Report
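[Presentation] Frame semantics and English learner's dictionaries (2023)
- Author(s)
  内田諭
- Organizer
  JACET English Lexicography Research Group (JACET英語辞書研究会), 1st regular meeting of FY2023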
- Related Report
  2023 Annual Research Report
- Invited
[Presentation] Examining English grammar with ChatGPT: A corpus linguistics perspective (2023)
- Author(s)
  内田諭
- Organizer
  Seminar commemorating the publication of 『テオリア』 and 『プラクシス』
- Related Report
  2023 Annual Research Report
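[Presentation] The potential of AI-based speaking assessment: A corpus linguistics perspective (2023)
- Author(s)
  内田諭
- Organizer
  Kantokoshinetsu Association of Teachers of English, Special Lecture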
- Related Report
  2023 Annual Research Report
- Invited
[Presentation] Can symbol emergence robots acquire polysemy? A cognitive linguistics perspective (2023)
- Author(s)
  内田諭
- Organizer
  15th LangRobo Meeting
- Related Report
  2023 Annual Research Report
[Presentation] Text Profiling (2023)
- Author(s)
  Satoru Uchida
- Organizer
  CEFR-J 2023 Webinar
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
[Presentation] Distractor Generation for Fill-in-the-Blank Exercises by Question Type (2023)
- Author(s)
  Nana Yoshimi, Tomoyuki Kajiwara, Satoru Uchida, Yuki Arase, Takashi Ninomiya
- Organizer
  61st Annual Meeting of the Association for Computational Linguistics (Student Research Workshop)
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
[Presentation] Exploring English collocations’ CEFR levels and frequency in learners’ writing (2023)
- Author(s)
  Yasutake Ishii and Satoru Uchida
- Organizer
  Vocab@Vic 2023
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
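[Presentation] Exploring the CEFR levels of collocations: Level gaps with constituent words and learners' usage (2023)
- Author(s)
  石井康毅・内田諭
- Organizer
  2nd FY2023 Meeting of the Methodology SIG, Kansai Chapter, Japan Association for Language Education and Technology (LET), held jointly with the Kyushu University Uchida KAKENHI Project Meeting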
- Related Report
  2023 Annual Research Report
[Presentation] How can we learn what academics think about their peers’ opinions? (2023)
- Author(s)
  Christopher G. Haswell, Jonathan Shachter
- Organizer
  KOTESOL International Conference
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
[Presentation] New routes for qualitative research using podcast interviews (2023)
- Author(s)
  Christopher G. Haswell, Jonathan Shachter
- Organizer
  PANSIG Conference
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
[Presentation] How experts view English Medium Interaction’s development: A podcast-related research narrative (2023)
- Author(s)
  Christopher G. Haswell, Jonathan Shachter
- Organizer
  9th Conference on International Higher Education, Tokyo
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
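[Presentation] CEFR-SP: CEFR-based sentence difficulty annotation (2023)
- Author(s)
  Yuki Arase
- Organizer
  2nd FY2023 Meeting of the Methodology SIG, Kansai Chapter, Japan Association for Language Education and Technology (LET), held jointly with the Kyushu University Uchida KAKENHI Project Meeting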
- Related Report
  2023 Annual Research Report
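[Presentation] Teaching collocations in junior and senior high school English classes (2023)
- Author(s)
  工藤洋路
- Organizer
  2nd FY2023 Meeting of the Methodology SIG, Kansai Chapter, Japan Association for Language Education and Technology (LET), held jointly with the Kyushu University Uchida KAKENHI Project Meeting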
- Related Report
  2023 Annual Research Report
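[Presentation] English as a Lingua Franca and authentic communication (2023)
- Author(s)
  Christopher G. Haswell
- Organizer
  2nd FY2023 Meeting of the Methodology SIG, Kansai Chapter, Japan Association for Language Education and Technology (LET), held jointly with the Kyushu University Uchida KAKENHI Project Meeting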
- Related Report
  2023 Annual Research Report
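[Presentation] Analysing English texts with the CEFR-J Grammar Profile and Text Profile (2023)
- Author(s)
  石井康毅, 内田諭
- Organizer
  FY2020-2024 Grant-in-Aid for Scientific Research (A) "A Comprehensive Study of CAN-DO Task-Centred Teaching and Assessment Based on the CEFR-J" (PI: 根岸雅史), CEFR-J 2023 Symposium Workshop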
- Related Report
  2022 Annual Research Report
[Presentation] Integrating ChatGPT into the EFL Classroom: Benefits and Challenges (2023)
- Author(s)
  Murphy, R., Wotley, D., and Minn, D.
- Organizer
  29th Annual Meeting of the Association for Natural Language Processing
- Related Report
  2022 Annual Research Report
[Presentation] Classification of Polysemous and Homograph Word Usages using Semi-Supervised Learning (2023)
- Author(s)
  Han, S., Iwana, B.K., and Uchida, S.
- Organizer
  29th Annual Meeting of the Association for Natural Language Processing
- Related Report
  2022 Annual Research Report
[Presentation] Lexical simplification by supervised learning with pseudo-data (2023)
- Author(s)
  野口夏希, 梶原智之, 荒瀬由紀, 内田諭, 二宮崇
- Organizer
  29th Annual Meeting of the Association for Natural Language Processing
- Related Report
  2022 Annual Research Report
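[Presentation] Automatic generation of distractors for English fill-in-the-blank vocabulary questions, taking question type into account (2023)
- Author(s)
  吉見菜那, 梶原智之, 内田諭, 荒瀬由紀, 二宮崇
- Organizer
  29th Annual Meeting of the Association for Natural Language Processing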
- Related Report
  2022 Annual Research Report
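[Presentation] An analysis of the ability of masked language models to answer English fill-in-the-blank questions (2023)
- Author(s)
  田中康介, 吉見菜那, 梶原智之, 内田諭, 荒瀬由紀
- Organizer
  85th National Convention of the Information Processing Society of Japan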
- Related Report
  2022 Annual Research Report
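[Presentation] Differences between synonyms as seen through corpora (2022)
- Author(s)
  内田諭
- Organizer
  Rethinking English learning and views of English grammar: Seminar commemorating the publication of 『テオリア』 & 『プラクシス』, for updating high school teachers' approaches to English learning and views of English grammar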
- Related Report
  2022 Annual Research Report
[Presentation] The evolution of machine translation and language education (2022)
- Author(s)
  内田諭
- Organizer
  QAOS Brown Bag Seminar No.058
- Related Report
  2022 Annual Research Report
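[Presentation] Approaches to grammar instruction in junior and senior high schools (2022)
- Author(s)
  工藤洋路
- Organizer
  Lecture hosted by the 英語教育連環センター, Graduate School of Foreign Language Education and Research, Kansai University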
- Related Report
  2022 Annual Research Report
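[Presentation] On high school students' perception that they "do not know how to study English": An analysis of results from the "Longitudinal Survey on English Learning" (2022)
- Author(s)
  工藤洋路, 長沼君主, 津久井貴之, 森下みゆき, 福本優美子
- Organizer
  47th Hokkaido Conference of the Japan Society of English Language Education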
- Related Report
  2022 Annual Research Report
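[Book] Handbook of English Communication in a Global Society: With a Dictionary of Speech Acts and Politeness Expressions (2024)
- Author(s)
  川村晶彦 (ed.), 石井康毅・内田諭 (contributing authors)
- Total Pages
  392
- Publisher
  Sanseido (三省堂)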
- ISBN
  9784385353579
- Related Report
  2023 Annual Research Report
[Book] Methods for Research on Language and Communication Using Data (2023)
- Author(s)
  大津隆広 (ed.), 内田諭 (chapter author)
- Total Pages
  264
- Publisher
  Hituzi Syobo (ひつじ書房)
- ISBN
  9784823410437
- Related Report
  2023 Annual Research Report
[Book] Frame Semantics and FrameNet (2023)
- Author(s)
  藤井聖子・内田諭
- Total Pages
  316
- Publisher
  Kenkyusha (研究社)
- ISBN
  9784327401788
- Related Report
  2023 Annual Research Report
