Research on Measuring Higher Order Thinking Abilities via Performances
Project/Area Number | 14510158 |
Research Category | Grant-in-Aid for Scientific Research (C) |
Allocation Type | Single-year Grants |
Section | General |
Research Field | Educational and Social Psychology |
Research Institution | Tokyo Metropolitan University |
Principal Investigator | HIRAI Yoko, Tokyo Metropolitan University, Faculty of Social Sciences and Humanities, Associate Professor (40285078) |
Project Period (FY) | 2002 – 2004 |
Project Status | Completed (Fiscal Year 2004) |
Budget Amount | ¥3,500,000 (Direct Cost: ¥3,500,000) |
Fiscal Year 2004: ¥900,000 (Direct Cost: ¥900,000)
Fiscal Year 2003: ¥1,200,000 (Direct Cost: ¥1,200,000)
Fiscal Year 2002: ¥1,400,000 (Direct Cost: ¥1,400,000)
Keywords | performance assessment / writing task / generalizability theory / rater error / scoring rubric / rater training / higher order thinking abilities / measurement validity / subjective rating / fairness / essay writing / rating / logical thinking ability / critical thinking ability / learning skills / thinking styles / humanities undergraduates |
Research Abstract |
Studies on the measurement of higher order thinking abilities via performances were conducted. First, the NAEP was examined to determine how it ensures the consistency of ratings through the training and monitoring of raters. Second, previous studies on rater errors were reviewed under three categories: studies based on descriptive error indices, studies based on generalizability theory, and studies based on the FACETS model, with a brief methodological summary for each category. It was found that errors due to raters (main effect and interactions) can be kept to a negligible level when sufficient rater training and detailed scoring rubrics are provided. The more substantial problem for performance assessment was the subject-by-task interaction: subjects showed inconsistent performance from task to task in almost all of the studies reviewed. It can be said, then, that several short tasks, each focused on a single trait and balanced in content, are preferable for attaining more accurate and fair assessment. Third, based on the writings of 268 subjects, analyses were performed on the effects of the particularization of scoring rubrics, the number of raters, and the number of tasks. It was found that thirty minutes of training might not suffice to make raters' judgments uniform, and that in such a situation, particularizing the rubrics did not improve rater reliability. It was also found that trait-focused scoring does not necessarily lead to greater consistency or agreement between raters than holistic scoring, and that at least two raters were needed for a relative decision and at least four raters were needed for an absolute decision. Finally, a research design for the follow-up study was discussed, and the development of new writing and thinking tasks and of a new inventory of thinking styles was reported.
|
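The statement about the number of raters needed for relative versus absolute decisions is the kind of result produced by a generalizability-theory D-study. Below is a minimal Python sketch, not part of the original report, assuming a fully crossed persons x raters x tasks (p x r x t) design; the variance components in the example are illustrative placeholders, not the estimates obtained from the 268 subjects in this project.

# Sketch of a generalizability-theory D-study for a crossed p x r x t design.
# Illustrative only: variance components are made up, not the project's estimates.

def d_study(var, n_raters, n_tasks):
    """Return (Erho2, Phi) for the given numbers of raters and tasks.

    `var` maps variance-component names to estimates:
    'p', 'r', 't', 'pr', 'pt', 'rt', 'prt_e'.
    """
    # Relative error variance: only components interacting with persons contribute.
    rel_err = (var['pr'] / n_raters
               + var['pt'] / n_tasks
               + var['prt_e'] / (n_raters * n_tasks))
    # Absolute error variance: rater/task main effects and their interaction also count.
    abs_err = (rel_err
               + var['r'] / n_raters
               + var['t'] / n_tasks
               + var['rt'] / (n_raters * n_tasks))
    e_rho2 = var['p'] / (var['p'] + rel_err)   # generalizability coefficient (relative decisions)
    phi = var['p'] / (var['p'] + abs_err)      # dependability coefficient (absolute decisions)
    return e_rho2, phi


if __name__ == "__main__":
    # Hypothetical variance components for an essay-scoring G-study.
    components = {'p': 0.50, 'r': 0.05, 't': 0.10,
                  'pr': 0.08, 'pt': 0.25, 'rt': 0.02, 'prt_e': 0.30}
    for n_r in (1, 2, 3, 4):
        for n_t in (1, 2, 3):
            e_rho2, phi = d_study(components, n_r, n_t)
            print(f"raters={n_r} tasks={n_t}  Erho2={e_rho2:.2f}  Phi={phi:.2f}")

Under such assumed components, the dependability coefficient Phi rises more slowly than Erho2 as raters are added, which is why an absolute decision typically demands more raters than a relative one.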