2022 Fiscal Year Final Research Report

Robust automated essay scoring method integrating deep neural networks and item response theory

Research Project

Project/Area Number	20K20817
Research Category	Grant-in-Aid for Challenging Research (Exploratory)
Allocation Type	Multi-year Fund
Review Section	Medium-sized Section 9: Education and related fields
Research Institution	The University of Electro-Communications
Principal Investigator	Uto Masaki, The University of Electro-Communications, Graduate School of Informatics and Engineering, Associate Professor (10732571)
Project Period (FY)	2020-07-30 – 2023-03-31
Keywords	automated essay scoring / item response theory / deep learning / rater bias / reliability / test theory
Outline of Final Research Achievements	In automated essay scoring (AES), scores are assigned to essays automatically as an alternative to grading by humans. Conventional AES models generally require training on a large dataset of graded essays. However, the grades in such a training dataset are known to be biased by rater characteristics when each essay is graded by only a few raters drawn from a larger rater set. The performance of AES models drops when such biased data are used for training. Researchers in educational and psychological measurement have recently proposed item response theory (IRT) models that can estimate essay scores while accounting for rater bias. This study therefore proposed a new method that trains AES models on IRT-based scores in order to address rater bias in the training data.
Free Research Field	Educational technology
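Academic Significance and Societal Importance of the Research Achievements	Although the performance of automated scoring technology has improved year by year, the gains have been marginal, and a substantial improvement is thought to require a fundamental rethinking of the approach. The problem with existing methods that this study identifies is a fundamental one that applies to all existing automated scoring methods, yet it has been overlooked in prior research. This study proposes a theoretically grounded and simple solution to that problem. We consider it to be research of high academic impact: it has the potential to improve automated scoring performance substantially and could serve as a foundational framework for automated scoring methods in the future.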
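
The idea in the outline, replacing raw ratings with IRT-based scores that account for rater bias, can be illustrated with a small sketch. This record does not specify the project's actual IRT model or estimation procedure, so the code below is only an assumption-laden stand-in: it uses a basic many-facet Rasch model in rating-scale form, treats rater severities and step parameters as already known, and estimates each essay's latent score by a simple grid search. The names `category_probs`, `estimate_theta`, and all parameter values are hypothetical.

```python
import math


def category_probs(theta, severity, steps):
    """Score-category probabilities under a many-facet Rasch model.

    theta: latent essay quality; severity: rater severity;
    steps: step difficulties d_1..d_{K-1} for K score categories 0..K-1.
    """
    # Category 0 has logit 0; category k adds (theta - severity - d_k).
    logits = [0.0]
    for d in steps:
        logits.append(logits[-1] + (theta - severity - d))
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(v - peak) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


def estimate_theta(ratings, severities, steps, grid=None):
    """Grid-search maximum-likelihood estimate of theta for one essay.

    ratings: list of (rater_id, score) pairs for the essay.
    severities: dict mapping rater_id to an (assumed known) severity.
    """
    if grid is None:
        grid = [i / 100 for i in range(-400, 401)]  # theta in [-4, 4]

    def log_lik(theta):
        return sum(
            math.log(category_probs(theta, severities[r], steps)[x])
            for r, x in ratings
        )

    return max(grid, key=log_lik)


# Hypothetical setup: four score categories (0..3) and two raters.
steps = [-1.0, 0.0, 1.0]
severities = {"lenient": -1.0, "severe": 1.0}

# Both essays received the same raw score, but from different raters.
theta_a = estimate_theta([("lenient", 2)], severities, steps)
theta_b = estimate_theta([("severe", 2)], severities, steps)

# The essay graded by the severe rater gets the higher IRT-based score.
print(theta_a, theta_b)
```

Under these assumptions, two essays with identical raw scores receive different IRT-based scores because the model attributes part of each rating to the rater. An AES model would then be trained to predict values such as `theta_a` and `theta_b` rather than the raw ratings.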