Checkbox grading of large-scale mathematics exams with multiple assessors: Field study on assessors’ inter-rater reliability, time investment and usage experience
{"title":"Checkbox grading of large-scale mathematics exams with multiple assessors: Field study on assessors’ inter-rater reliability, time investment and usage experience","authors":"Filip Moons , Ellen Vandervieren , Jozef Colpaert","doi":"10.1016/j.stueduc.2024.101443","DOIUrl":null,"url":null,"abstract":"<div><div>Assessing exams with multiple assessors is challenging regarding inter-rater reliability and feedback. This paper presents ‘checkbox grading,’ a digital method where exam designers have predefined checkboxes with both feedback and associated partial grades. Assessors then tick the checkboxes relevant to a student solution. Dependencies between checkboxes ensure consistency among assessors in following the grading scheme. Moreover, the approach supports ‘blind grading’ by hiding the grades associated with the checkboxes, thus focusing assessors on the criteria rather than the scores. The approach was studied during a large-scale mathematics state exam. Results show that assessors perceived checkbox grading as very useful. However, compared to traditional grading—where assessors follow a correction scheme and communicate the resulting grade—more time is spent on checkbox grading, while both approaches are equally reliable. Blind grading improved inter-rater reliability for some tasks. 
Overall, checkbox grading might lead to a smoother process where feedback, not solely grades, is communicated to students.</div></div>","PeriodicalId":47539,"journal":{"name":"Studies in Educational Evaluation","volume":"85 ","pages":"Article 101443"},"PeriodicalIF":2.6000,"publicationDate":"2025-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Studies in Educational Evaluation","FirstCategoryId":"95","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0191491X24001299","RegionNum":2,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"EDUCATION & EDUCATIONAL RESEARCH","Score":null,"Total":0}
引用次数: 0
Abstract
Assessing exams with multiple assessors is challenging regarding inter-rater reliability and feedback. This paper presents ‘checkbox grading,’ a digital method in which exam designers predefine checkboxes that combine feedback with associated partial grades. Assessors then tick the checkboxes relevant to a student solution. Dependencies between checkboxes ensure consistency among assessors in following the grading scheme. Moreover, the approach supports ‘blind grading’ by hiding the grades associated with the checkboxes, thus focusing assessors on the criteria rather than the scores. The approach was studied during a large-scale mathematics state exam. Results show that assessors perceived checkbox grading as very useful. However, compared to traditional grading—where assessors follow a correction scheme and communicate the resulting grade—more time is spent on checkbox grading, while both approaches are equally reliable. Blind grading improved inter-rater reliability for some tasks. Overall, checkbox grading might lead to a smoother process where feedback, not solely grades, is communicated to students.
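The mechanism described in the abstract—predefined checkboxes carrying both feedback and partial grades, dependency constraints between checkboxes, and an optional blind mode that hides scores—can be illustrated with a minimal sketch. This is not the authors' tool; all names (`Checkbox`, `grade`, the example rubric) are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Checkbox:
    """One predefined grading criterion (illustrative sketch, not the paper's implementation)."""
    label: str                                     # feedback text attached to this criterion
    points: float                                  # partial grade awarded when ticked
    requires: list = field(default_factory=list)   # ids of checkboxes that must also be ticked

def grade(checkboxes: dict, ticked: set, blind: bool = False):
    """Enforce dependencies, then return (total, feedback).

    With blind=True the per-solution total is withheld, mirroring 'blind grading':
    the assessor sees only the criteria, not the resulting score.
    """
    # Dependency constraints keep assessors consistent with the grading scheme:
    # a checkbox may only be ticked if everything it depends on is also ticked.
    for cid in ticked:
        for dep in checkboxes[cid].requires:
            if dep not in ticked:
                raise ValueError(f"'{cid}' requires '{dep}' to be ticked")
    total = sum(checkboxes[c].points for c in ticked)
    feedback = [checkboxes[c].label for c in ticked]
    return (None if blind else total), feedback

# Hypothetical rubric for one exam task:
scheme = {
    "method": Checkbox("Correct solution method chosen", 1.0),
    "algebra": Checkbox("Algebraic steps carried out correctly", 1.0, requires=["method"]),
    "answer": Checkbox("Final answer correct", 0.5, requires=["algebra"]),
}

total, fb = grade(scheme, {"method", "algebra"})
```

Because each checkbox bundles feedback with its partial grade, the same ticks that produce the score also produce the feedback that, per the abstract, could be communicated to students alongside (or instead of) the grade alone.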
Journal description:
Studies in Educational Evaluation publishes original reports of evaluation studies. Four types of articles are published by the journal: (a) Empirical evaluation studies representing evaluation practice in educational systems around the world; (b) Theoretical reflections and empirical studies related to issues involved in the evaluation of educational programs, educational institutions, educational personnel and student assessment; (c) Articles summarizing the state-of-the-art concerning specific topics in evaluation in general or in a particular country or group of countries; (d) Book reviews and brief abstracts of evaluation studies.