Predicting the oral proficiency levels with machine learning techniques
Project/Area Number |
26770205
|
Research Category |
Grant-in-Aid for Young Scientists (B)
|
Allocation Type | Multi-year Fund |
Research Field |
Foreign language education
|
Research Institution | Toyo University (2015-2016), Ritsumeikan University (2014) |
Principal Investigator |
|
Project Period (FY) |
2014-04-01 – 2017-03-31
|
Project Status |
Completed (Fiscal Year 2016)
|
Budget Amount |
¥3,380,000 (Direct Cost: ¥2,600,000, Indirect Cost: ¥780,000)
Fiscal Year 2016: ¥1,040,000 (Direct Cost: ¥800,000, Indirect Cost: ¥240,000)
Fiscal Year 2015: ¥1,170,000 (Direct Cost: ¥900,000, Indirect Cost: ¥270,000)
Fiscal Year 2014: ¥1,170,000 (Direct Cost: ¥900,000, Indirect Cost: ¥270,000)
|
Keywords | automated scoring / learner corpus / natural language processing / machine learning / corpus / speaking |
Outline of Final Research Achievements |
The present study aimed to automatically evaluate second language (L2) spoken English using automated scoring techniques. For the analysis, I used the NICT JLE Corpus, a corpus of 1,281 Japanese EFL learners coded into nine oral proficiency levels. The nine levels served as the criterion variable, and the linguistic features analyzed in Biber (1988) served as explanatory variables. Random forests were used to predict oral proficiency. The model correctly classified 61.28% of the L2 spoken productions. Compared with the baseline accuracy of 37.63%, obtained by the simplest possible algorithm of always choosing the most frequent level, my random forest model improved prediction by 23.65 percentage points. The predictors that most clearly discriminated between oral proficiency levels were, in descending order of strength, prepositions, first person pronouns, adverbs, and contractions. The results of this study can be applied to creating assessments that are more appropriate for scaling the oral performances of EFL learners.
|
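The comparison against the majority-class baseline reported in the outline can be illustrated with a small Python sketch. It is not the code used in the project: the function names and the toy labels are hypothetical, and only the two percentages (61.28 and 37.63) are taken from the outline.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy (in %) of always predicting the most frequent label."""
    if not labels:
        raise ValueError("labels must be non-empty")
    most_common_count = Counter(labels).most_common(1)[0][1]
    return 100.0 * most_common_count / len(labels)


def accuracy(true_labels, predicted_labels):
    """Share (in %) of predictions that match the true labels."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and of equal length")
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * hits / len(true_labels)


def gain_in_points(model_acc, baseline_acc):
    """Improvement over the baseline, in percentage points."""
    return round(model_acc - baseline_acc, 2)


# Hypothetical toy data: proficiency levels for eight learners.
toy_true = [4, 4, 4, 5, 5, 3, 6, 4]
toy_pred = [4, 4, 5, 5, 5, 3, 4, 4]

toy_baseline = majority_baseline_accuracy(toy_true)  # level 4 is most frequent
toy_model = accuracy(toy_true, toy_pred)
toy_gain = gain_in_points(toy_model, toy_baseline)

# Figures reported in the outline: 61.28% (model) vs 37.63% (baseline).
reported_gain = gain_in_points(61.28, 37.63)
print(toy_baseline, toy_model, toy_gain, reported_gain)
```

The gain is expressed in percentage points (a difference between two percentages), not as a relative percentage increase.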
Report (4 results)
Research Products (29 results)