2004 Fiscal Year Final Research Report Summary
A Research on Measuring Higher Order Thinking Abilities via Performances
Project/Area Number |
14510158
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Educational and social psychology
|
Research Institution | Tokyo Metropolitan University |
Principal Investigator |
HIRAI Yoko Tokyo Metropolitan University, Faculty of Social Sciences and Humanities, Associate Professor (40285078)
|
Project Period (FY) |
2002 – 2004
|
Keywords | performance assessment / writing task / generalizability theory / rater error / scoring rubric / rater training |
Research Abstract |
Studies on the measurement of higher-order thinking abilities via performances were conducted. First, NAEP (the National Assessment of Educational Progress) was examined to find out how it ensures the consistency of ratings through the training and monitoring of raters. Second, previous studies of rater errors were reviewed in three categories: studies based on descriptive error indices, studies based on generalizability theory, and studies based on the FACETS model. A brief methodological summary was provided for each category. It was found that errors due to raters (main effect and interactions) could be kept to a negligible level if sufficient rater training and detailed scoring rubrics were provided. Rather, the substantial problem for performance assessment was the subject-by-task interaction: subjects showed inconsistent performance from task to task in almost all of the studies reviewed. It can therefore be said that several small tasks, each focused on a single trait and balanced in content, are preferable for attaining more accurate and fair assessment. Third, based on the writings of 268 subjects, analyses were performed of the effects of the particularization of scoring rubrics, the number of raters, and the number of tasks. It was found that thirty minutes of training might not suffice to make the raters' judgments uniform and that, in that situation, particularizing the rubrics did not add to rater reliability. It was also found that trait-focused scoring does not necessarily lead to greater consistency or agreement between raters than holistic scoring, and that at least two raters were needed for a relative decision and at least four raters for an absolute decision. Finally, a research design for the follow-up study was discussed, and the development of new tasks for writing and thinking abilities and of a new inventory of thinking styles was reported.
|
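The distinction the abstract draws between relative and absolute decisions can be illustrated with a minimal decision-study sketch from generalizability theory. It assumes a simplified persons-by-raters crossed design with a single task, which is narrower than the design described above. The variance components, the 0.8 threshold, and the function names `d_study` and `min_raters` are hypothetical and chosen only for illustration; they are not estimates or procedures taken from this project.

```python
def d_study(var_p, var_r, var_pr_e, n_raters):
    """Return (generalizability coefficient, dependability coefficient)
    for a persons x raters crossed design averaged over n_raters raters.

    var_p    : person (true-score) variance component
    var_r    : rater main-effect variance component
    var_pr_e : person-by-rater interaction confounded with residual error
    """
    # Relative decisions rank persons, so the rater main effect cancels out.
    rel_error = var_pr_e / n_raters
    # Absolute decisions interpret the score level, so rater severity counts.
    abs_error = (var_r + var_pr_e) / n_raters
    g_coef = var_p / (var_p + rel_error)
    phi_coef = var_p / (var_p + abs_error)
    return g_coef, phi_coef


def min_raters(var_p, var_r, var_pr_e, threshold, absolute, max_raters=50):
    """Smallest number of raters whose coefficient reaches the threshold,
    or None if max_raters is not enough."""
    for n in range(1, max_raters + 1):
        g_coef, phi_coef = d_study(var_p, var_r, var_pr_e, n)
        if (phi_coef if absolute else g_coef) >= threshold:
            return n
    return None


# Hypothetical variance components (NOT taken from the report).
VAR_P, VAR_R, VAR_PR_E = 1.0, 0.5, 0.5

if __name__ == "__main__":
    for n in (1, 2, 4):
        g_coef, phi_coef = d_study(VAR_P, VAR_R, VAR_PR_E, n)
        print(f"raters={n}  G={g_coef:.3f}  Phi={phi_coef:.3f}")
    print("relative:", min_raters(VAR_P, VAR_R, VAR_PR_E, 0.8, absolute=False))
    print("absolute:", min_raters(VAR_P, VAR_R, VAR_PR_E, 0.8, absolute=True))
```

Because the rater main effect enters only the absolute error term, an absolute decision can never need fewer raters than a relative one under this model, which is consistent with the direction of the finding reported in the abstract.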