Development of statistical scoring system for essay-type tests
Project/Area Number |
17300088
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Statistical science
|
Research Institution | Tohoku University (2007), Niigata University (2005-2006) |
Principal Investigator |
SHIBAYAMA Tadashi Tohoku University, Graduate School of Education, Professor (70240752)
|
Co-Investigator (Kenkyū-buntansha) |
MAEDA Tadahiko The Institute of Statistical Mathematics, Department of Data Science, Associate Professor (10247257)
NITTA Katsumi Tokyo Institute of Technology, Interdisciplinary Graduate School of Science and Engineering, Professor (60293073)
NOGUCHI Hiroyuki Nagoya University, Graduate School of Education and Human Development, Professor (60114815)
MACHIMURA Yasutaka Hokkaido University, School of Law, Professor (60199726)
FUJIMOTO Akira Shizuoka University, Law School, Professor (80300474)
|
Project Period (FY) |
2005 – 2007
|
Project Status |
Completed (Fiscal Year 2007)
|
Budget Amount |
¥15,640,000 (Direct Cost: ¥14,800,000, Indirect Cost: ¥840,000)
Fiscal Year 2007: ¥3,640,000 (Direct Cost: ¥2,800,000, Indirect Cost: ¥840,000)
Fiscal Year 2006: ¥4,600,000 (Direct Cost: ¥4,600,000)
Fiscal Year 2005: ¥7,400,000 (Direct Cost: ¥7,400,000)
|
Keywords | essay-type examination / essay test / automated scoring / performance assessment / Generalizability theory / test / reliability / assessment / scoring / essay-type / missing values / keyword matching method |
Research Abstract |
Procedures
1) The first experiment used a connected scoring design. The design matrix consisted of six raters and 290 essays. The six raters rated 260 essays using either the heuristic method or the analytical method, which yielded an incomplete data matrix with a total of 1560 scores. We also developed an automated scoring system using these answers.
2) In the second experiment, 10 raters rated 280 essays, which yielded a complete data matrix. These data were used in simulation studies, and the answers were used to improve the accuracy of the automated scoring system.
Results
1) Optimal scoring design: On the basis of simulation studies, we proposed a connected scoring design in which two or three raters rate a common block of essays, but no rater rates all essays. This design makes it possible to obtain scores more efficiently.
2) Reliability assessment: We proposed procedures for assessing the reliability of the connected scoring design, based on Generalizability theory and an ANOVA approach for an incomplete data matrix. The inter-correlation coefficient was only 0.293 for the heuristic scoring method, whereas it was 0.498 for the analytical scoring method. Furthermore, when the scores were transformed by the proposed method, the coefficient was 0.586.
3) Development of an automated scoring system: We developed an automated scoring system based on an SVM (Support Vector Machine) and improved it. Given about 200 sample answers, the system can rate essays as accurately as human raters. This result means that the system can assist human raters in scoring procedures.
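As a rough illustration of the connected scoring design described above, the following Python sketch assigns each essay to two raters in an overlapping chain and checks that all raters are linked through shared essays. The ring layout, the function names `chain_design` and `is_connected`, and the choice of two raters per block are assumptions made for illustration only; the abstract does not specify the design actually used in the project.

```python
def chain_design(n_essays, n_raters, raters_per_block=2):
    """Hypothetical layout: essays are dealt into n_raters blocks, and each
    block is rated by `raters_per_block` consecutive raters (wrapping around)."""
    design = {}
    for essay in range(n_essays):
        block = essay % n_raters
        design[essay] = [(block + k) % n_raters for k in range(raters_per_block)]
    return design


def is_connected(design, n_raters):
    """Return True if every rater is linked to every other rater through a
    chain of commonly rated essays (union-find over raters)."""
    parent = list(range(n_raters))

    def find(r):
        while parent[r] != r:
            parent[r] = parent[parent[r]]  # path compression
            r = parent[r]
        return r

    for raters in design.values():
        root = find(raters[0])
        for r in raters[1:]:
            parent[find(r)] = root  # merge raters who share this essay
    return len({find(r) for r in range(n_raters)}) == 1


# Sizes taken from the first experiment: six raters, 290 essays.
design = chain_design(n_essays=290, n_raters=6)
n_scores = sum(len(raters) for raters in design.values())
print(n_scores, is_connected(design, 6))
```

Under these assumptions the design needs 580 scores instead of the 1740 required to have every rater score every essay, while the overlap between neighbouring raters keeps the design connected so that rater effects remain comparable.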
|
Report
(4 results)
Research Products
(11 results)