2021 Fiscal Year Research-status Report
Developing a program for language teaching with parsed corpora
Project/Area Number | 19K00541 |
Research Institution | Hirosaki University |
Principal Investigator | Butler Alastair, Hirosaki University, Faculty of Humanities and Social Sciences, Associate Professor (90588873) |
Project Period (FY) | 2019-04-01 – 2023-03-31 |
Keywords | grammatical analysis / parsed corpora / language teaching / English / Japanese |
Outline of Annual Research Achievements |
The implementation plan is to develop a program for language teaching with parsed corpora. The components are: 1) a grammar textbook focused on English language learning for Japanese students at university level, 2) a large grammatically analysed corpus of English, also linked to Japanese language analysis for purposes of comparison, and 3) a "toolkit" for creating analyses, so that students can start analysing their own written language. The goal is to empower students to critically analyse their own use of language and to draw them into exploring wider insights from the grammatically analysed corpus. The third year of the project has seen further development in all three components.
|
Current Status of Research Progress |
2: Research has progressed on the whole more than originally planned.
Reason
Textbook development continued and branched into two resources: (i) an introductory guide to the English parsed corpus, and (ii) a supplement to a published textbook that links to corpus queries. The results have been released on the web. Although the size of the analysed corpus did not increase (43,835 trees; 467,414 words), many improvements were made to the analysis. Most notably, verb codes were added to assist with word sense disambiguation (currently 34,406 instances are complete and 42,051 remain incomplete). The largest share of the work went into improving the online corpus interface. This was described in a conference presentation (LENLS 18), and is available from https://entrees.github.io/.
|
Strategy for Future Research Activity |
The parsing guide will be further enlarged and refined. Further content will be added to the textbook supplement for it to become useful as an independent resource. The dependency analysis gained from the "toolkit" will be integrated into the corpus interface to offer an alternative way for students to gain a strong feeling for how words interact with each other.
|
Causes of Carryover |
In the next fiscal year, annotation of the English corpus will continue, adding verb code information for word sense disambiguation to help students gain knowledge of vocabulary and word use. The toolkit for analysing English will also be enhanced to improve its coverage of language phenomena, including by integrating verb code information. This will help disambiguate the grammatical analysis so that more sentence ambiguity can be resolved automatically. Additionally, improvements will be made to the online corpus interface.
|
Research Products
(2 results)