
2021 Fiscal Year Research-status Report

Developing a program for language teaching with parsed corpora

Research Project

Project/Area Number 19K00541
Research Institution Hirosaki University

Principal Investigator

Alastair Butler  Hirosaki University, Faculty of Humanities and Social Sciences, Associate Professor (90588873)

Project Period (FY) 2019-04-01 – 2023-03-31
Keywords grammatical analysis / parsed corpora / language teaching / English / Japanese
Outline of Annual Research Achievements

The implementation plan is to develop a program for language teaching with parsed corpora. The components are: 1) a grammar textbook focused on English language learning for Japanese students at university level, 2) a large grammatically analysed corpus of English, also linked to Japanese language analysis for purposes of comparison, and 3) the development of a "toolkit" for analysis creation, for students to start analysing their own written language. The goal is to empower students to critically analyse their own use of language and be drawn to explore wider insights from the grammatically analysed corpus. The third year of the project has seen further development in all three components of the project.

Current Status of Research Progress

2: On the whole, research has progressed more than originally planned.

Reason

Textbook development continued and branched into two parts: (i) an introductory guide to the English parsed corpus, and (ii) a supplement to a published textbook that links to corpus queries. The results have been released on the web. While the size of the analysed corpus has not increased (43,835 trees; 467,414 words), many improvements were made to the analysis. Most notably, verb codes were added to assist with word sense disambiguation (currently 34,406 completed instances and 42,051 incomplete instances). The largest amount of work went into improving the online corpus interface, which was described in a conference presentation (LENLS 18) and is available from https://entrees.github.io/.
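As a rough reading aid for the figures above, the following sketch derives two summary ratios from them: the mean number of words per tree and the share of verb code instances that are complete. The function and constant names are illustrative and not part of the project's toolkit. The second ratio assumes that the completed and incomplete instances together make up all verb instances to be coded, which the report does not state explicitly.

```python
# Figures quoted in the report (third project year).
TREES = 43_835
WORDS = 467_414
VERB_CODES_COMPLETE = 34_406
VERB_CODES_INCOMPLETE = 42_051


def mean_words_per_tree(words: int, trees: int) -> float:
    """Average number of words in one parsed tree."""
    return words / trees


def completed_share(complete: int, incomplete: int) -> float:
    """Fraction of verb code instances that are complete.

    Assumption: complete + incomplete covers every verb instance
    that is meant to receive a verb code.
    """
    return complete / (complete + incomplete)


if __name__ == "__main__":
    print(f"words per tree: {mean_words_per_tree(WORDS, TREES):.2f}")
    share = completed_share(VERB_CODES_COMPLETE, VERB_CODES_INCOMPLETE)
    print(f"verb codes complete: {share:.1%}")
```

Under that assumption, a little under half of the verb code annotation is done, which is consistent with the plan to continue this annotation in the next fiscal year.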

Strategy for Future Research Activity

The parsing guide will be further enlarged and refined. Further content will be added to the textbook supplement for it to become useful as an independent resource. The dependency analysis gained from the "toolkit" will be integrated into the corpus interface to offer an alternative way for students to gain a strong feeling for how words interact with each other.

Causes of Carryover

In the next fiscal year, annotation of the English corpus will continue, adding verb code information for word sense disambiguation to help students gain knowledge of vocabulary and word use. Enhancements will also be made to the toolkit for analysing English to improve its coverage of language phenomena. These will include integrating verb code information into the toolkit, which will help disambiguate the grammatical analysis so that more sentence ambiguity can be resolved automatically. Additionally, improvements will be made to the online corpus interface.

  • Research Products

    (2 results)


Journal Article (1 result; of which Int'l Joint Research: 1, Peer Reviewed: 1) / Presentation (1 result)

  • [Journal Article] Knowledge Acquisition from Natural Language with Treebank Semantics and FLORA-2 (2021)

    • Author(s)
      Alastair Butler
    • Journal Title

      Lecture Notes in Computer Science, New Frontiers in Artificial Intelligence. JSAI-isAI 2020

      Volume: 12758 Pages: 37-49

    • DOI

      10.1007/978-3-030-79942-7_3

    • Peer Reviewed / Int'l Joint Research
  • [Presentation] Parsed corpus development with a quick access interface (2021)

    • Author(s)
      Alastair Butler
    • Organizer
      Logic and Engineering of Natural Language Semantics 18 (LENLS18)


Published: 2022-12-28  
