
2017 Fiscal Year Annual Research Report

Investigating a Learner Corpus of Computer-mediated Communication

Research Project

Project/Area Number 26580077
Research Institution Gakushuin University

Principal Investigator

MARCHAND Tim  Gakushuin University, Faculty of International Social Sciences, Associate Professor (20645197)

Co-Investigator (Kenkyū-buntansha) 阿久津 純恵  Toyo University, Faculty of Human Life Design, Lecturer (20460024)
Project Period (FY) 2014-04-01 – 2018-03-31
Keywords learner corpus / CMC / longitudinal development / mixed-effects regression
Outline of Annual Research Achievements

(1) Introduced a more robust three-step process for identifying and tagging spelling errors, which tagged 5,567 potential replacements in the learner corpus. US and UK spelling variants were also identified and replaced where necessary, so that bigram analysis against large reference corpora (such as COCA) is more accurate.
(2) Tested POS-tagging accuracy for the learner corpus by comparing a manually tagged random sample of tokens from the corpus with the tagged output from WMatrix. Accuracy figures for the CLAWS7 tags: precision 0.974 (0.980), recall 0.977 (0.981) and F-measure 0.976 (0.981). The figures in parentheses are the results after manually correcting some of the corpus-based tagging errors.
(3) Identified learner proficiency levels through analysis of the questionnaire data. Learners were placed into CEFR-equivalent proficiency levels:
A1 13% / A2 19% / B1 39% / B2 16% / C1 2% / NA 10%
(4) Piloted mixed-effects regression models, although this work is still incomplete, to find correlations between learner profile and longitudinal development. Initial results suggest that the most significant correlations are between learner engagement variables and development, rather than between proficiency level and development, although this needs to be examined more thoroughly.
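
As an illustration of how the accuracy figures in item (2) can be obtained, the following is a minimal sketch that compares manually assigned tags with tagger output and derives precision, recall and F-measure per tag. The function names, the toy tag sequences and the use of macro-averaging are assumptions made for this example; the report does not state how the project averaged its figures.

```python
from collections import Counter


def per_tag_scores(gold, predicted):
    """Return {tag: (precision, recall, f_measure)} for two aligned tag lists."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must be aligned token for token")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # tagger proposed p where the annotator chose another tag
            fn[g] += 1  # tagger missed the annotator's tag g
    scores = {}
    for tag in set(gold) | set(predicted):
        p_den = tp[tag] + fp[tag]
        r_den = tp[tag] + fn[tag]
        precision = tp[tag] / p_den if p_den else 0.0
        recall = tp[tag] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f_measure = 2 * precision * recall / pr_sum if pr_sum else 0.0
        scores[tag] = (precision, recall, f_measure)
    return scores


def macro_average(scores):
    """Unweighted mean of precision, recall and F-measure over all tags (an assumption)."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))


# Invented toy sample using CLAWS7-style tags; not data from the project.
gold = ["NN1", "VVD", "NN1", "JJ", "NN1", "VVD"]
predicted = ["NN1", "VVD", "JJ", "JJ", "NN1", "NN1"]

scores = per_tag_scores(gold, predicted)
macro = macro_average(scores)
```

The same functions could be run twice, once on the raw tagger output and once after manual correction, to produce paired figures like those reported in parentheses.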

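For readers unfamiliar with the approach in item (4), one common form of a mixed-effects model for longitudinal learner data is sketched below. This is a generic illustration, not the project's actual specification: the outcome measure, the predictors and the random-effects structure are all assumptions, since the report does not describe them.

```latex
% Assumed notation (not from the report):
%   y_{ij}  development measure for learner i at time point j
%   t_{ij}  time of observation j for learner i
%   e_i     an engagement variable for learner i
%   p_i     proficiency level of learner i
\[
  y_{ij} = \beta_0 + \beta_1 t_{ij} + \beta_2 e_i + \beta_3 p_i
           + u_{0i} + u_{1i} t_{ij} + \varepsilon_{ij},
\]
\[
  (u_{0i}, u_{1i})^{\top} \sim \mathcal{N}(\mathbf{0}, \Sigma),
  \qquad
  \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2).
\]
```

Here the fixed effects \(\beta_2\) and \(\beta_3\) would capture the association of engagement and proficiency with the outcome, while the random intercept \(u_{0i}\) and random slope \(u_{1i}\) allow each learner an individual starting point and rate of change.
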
Published: 2018-12-17  
