
Automatic collocation generation for English learners as a foreign language using document similarity analysis

Research Project

Project/Area Number 16K00489
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Multi-year Fund
Section General
Research Field Learning support system
Research Institution Tsuda University

Principal Investigator

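Kishi Nobuko  Tsuda University, College of Liberal Arts, Professor (50245990)

Co-Investigator (Kenkyū-buntansha) 岸 康人  Kanagawa University, Affiliated Research Institute, Researcher (50552999)
田近 裕子  Tsuda University, College of Policy Studies, Professor (80188268)
久島 智津子  Tsuda University, Institute for Research in Language and Culture, Researcher (80623876)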
Project Period (FY) 2016-04-01 – 2020-03-31
Project Status Completed (Fiscal Year 2019)
Budget Amount
¥4,420,000 (Direct Cost: ¥3,400,000, Indirect Cost: ¥1,020,000)
Fiscal Year 2018: ¥780,000 (Direct Cost: ¥600,000, Indirect Cost: ¥180,000)
Fiscal Year 2017: ¥1,820,000 (Direct Cost: ¥1,400,000, Indirect Cost: ¥420,000)
Fiscal Year 2016: ¥1,820,000 (Direct Cost: ¥1,400,000, Indirect Cost: ¥420,000)
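Keywords English learning / document similarity / document classification / latent semantic analysis / automatic generation of teaching materials / machine learning / vocabulary learning / automatic creation of teaching materials / teaching material creation / teaching material generation / learning content development support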
Outline of Final Research Achievements

This study uses three document similarity measures, latent semantic analysis (LSA), bag of words, and term frequency-inverse document frequency (TF-IDF), to generate English collocations for learners of English as a foreign language. In our previous study, we found that latent semantic analysis was more suitable for generating collocations for English for specific purposes. However, the generated collocations were not usable as real learning materials because their difficulty level was not considered and the subject area was limited.
In this study, we used more computational resources to speed up the calculation and increase the number of documents processed. Furthermore, we used two sets of documents, an easy set and a difficult set, to estimate the difficulty level of each collocation from the difference between its similarities to the two sets. We also added similarity calculations based on shallow machine learning algorithms such as word2vec.
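
The two-set difficulty estimate can be illustrated with a minimal sketch. It is not the project's implementation: it uses TF-IDF cosine similarity only (not LSA or word2vec), and the function names, the toy document sets, and the choice to score difficulty as "mean similarity to the difficult set minus mean similarity to the easy set" are all assumptions made for illustration.

```python
import math
from collections import Counter

def tokenize(text):
    # Naive whitespace tokenizer; a real system would normalize more carefully.
    return text.lower().split()

def build_idf(documents):
    # Smoothed inverse document frequency over a list of token lists.
    n = len(documents)
    df = Counter()
    for tokens in documents:
        df.update(set(tokens))
    return {term: math.log((1 + n) / (1 + count)) + 1.0
            for term, count in df.items()}

def tfidf_vector(tokens, idf):
    # Sparse TF-IDF vector; terms unseen in the corpus are ignored.
    tf = Counter(tokens)
    return {term: count * idf[term] for term, count in tf.items() if term in idf}

def cosine(u, v):
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def mean_similarity(phrase, doc_set, idf):
    query = tfidf_vector(tokenize(phrase), idf)
    sims = [cosine(query, tfidf_vector(doc, idf)) for doc in doc_set]
    return sum(sims) / len(sims)

def difficulty_score(phrase, easy_docs, hard_docs):
    # Assumed scoring rule: positive means closer to the difficult set,
    # negative means closer to the easy set.
    easy = [tokenize(d) for d in easy_docs]
    hard = [tokenize(d) for d in hard_docs]
    idf = build_idf(easy + hard)
    return mean_similarity(phrase, hard, idf) - mean_similarity(phrase, easy, idf)

# Hypothetical toy corpora standing in for the easy and difficult document sets.
EASY_DOCS = ["the cat sat on the mat", "we make a cake at home"]
HARD_DOCS = ["the court will render a verdict",
             "researchers conduct a rigorous analysis"]

if __name__ == "__main__":
    for phrase in ["make a cake", "render a verdict"]:
        print(phrase, round(difficulty_score(phrase, EASY_DOCS, HARD_DOCS), 3))
```

With these toy sets, "render a verdict" scores above zero and "make a cake" below zero. Swapping `tfidf_vector` for an LSA or word2vec representation would leave the comparison between the two sets unchanged.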

Academic Significance and Societal Importance of the Research Achievements

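This research is part of a broader effort to automatically generate learning materials matched to the interests and proficiency of learners studying English as a second language. For working adults and university students, it is desirable to efficiently acquire the expressions actually used in their own work or field of specialization, but suitable materials (textbooks, books, videos, and so on) are very scarce. Meanwhile, the spread of Wikipedia and other open content has made English text easier to obtain. We therefore applied methods used in information retrieval, such as latent semantic analysis and frequency analysis, to automatically extract usage examples (English collocations) from large-scale text data as raw material for learning materials.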

Report

(5 results)
  • 2019 Annual Research Report
  • Final Research Report (PDF)
  • 2018 Research-status Report
  • 2017 Research-status Report
  • 2016 Research-status Report

Research Products

(1 result)

  • [Book] Visualization Tool for Finding Characteristics of Teaching and Learning Process of Scratch Programmers, Constructionism 2018 conference proceedings (2019)

    • Author(s)
      Nobuko Kishi, Mari Yoshida, Minori Yoshizawa, Aoi Yoshida
    • Total Pages
      9
    • Publisher
      Vilnius University Faculty of Philosophy and Institute of Data Science and Digital Technologies
    • ISBN
      9786099576015
    • Related Report
      2019 Annual Research Report


Published: 2016-04-21   Modified: 2023-07-20  
