Benchmark rating procedure, best of both worlds? Comparing procedures to rate text quality in a reliable and valid manner

IF 2.7; Zone 3 (Education); JCR Q1, Education & Educational Research

Assessment in Education-Principles Policy & Practice; Pub Date: 2023-07-04; DOI: 10.1080/0969594X.2023.2241656

Renske Bouwer, M. Koster, H. van den Bergh

Pages: 302-319

Citations: 5

Abstract

Assessing students’ writing performance is essential to adequately monitor and promote individual writing development, but it is also a challenge. The present research investigates a benchmark rating procedure for assessing texts written by upper-elementary students. In two studies we examined whether a benchmark rating procedure (1) leads to reliable and generalisable scores that converge with holistic and analytic ratings, and (2) can be used for rating texts varying in topic and genre. The results provide evidence that benchmark ratings are a valid indicator of text quality, as they converge with holistic and analytic scores. They are also associated with less rater variance and less task-specific variance, leading to reliable and generalisable ratings. Moreover, a benchmark scale can be used for rating different tasks with the same reliability, at least when texts are written in the same genre. Taken together, a benchmark rating procedure ensures meaningful and useful information on students’ writing.

Source journal

Assessment in Education-Principles Policy & Practice (Education & Educational Research)

CiteScore: 5.70

Self-citation rate: 3.10%

About the journal: Recent decades have witnessed significant developments in the field of educational assessment. New approaches to the assessment of student achievement have been complemented by the increasing prominence of educational assessment as a policy issue. In particular, there has been a growth of interest in modes of assessment that promote, as well as measure, standards and quality. These have profound implications for individual learners, institutions and the educational system itself. Assessment in Education provides a focus for scholarly output in the field of assessment. The journal is explicitly international in focus and encourages contributions from a wide range of assessment systems and cultures. The journal's intention is to explore both commonalities and differences in policy and practice.