Beyond literacy and competency – The effects of raters’ perceived uncertainty on assessment of writing
Mari Honko, Reeta Neittaanmäki, Scott Jarvis, Ari Huhta
Assessing Writing, July 2023. DOI: 10.1016/j.asw.2023.100768
Abstract
This study investigated how common raters’ experiences of uncertainty are before, during, and after the rating of writing performances in high-stakes testing, what forms these feelings of uncertainty take, and what reasons might underlie them. We also examined whether uncertainty was related to raters’ rating experience or to the quality of their ratings. The data were gathered from the writing raters (n = 23) in the Finnish National Certificates of Proficiency, a standardized high-stakes Finnish language examination. The data comprise 12,118 ratings as well as the raters’ survey responses and notes made during rating sessions. The survey responses were analyzed using thematic content analysis, and the ratings with descriptive statistics and many-facet Rasch analyses. The results show that uncertainty is variable and individual: even highly experienced raters can feel unsure about (some of) their ratings. However, uncertainty was not related to rating quality (consistency or severity/leniency), nor did it diminish with growing experience. Uncertainty during actual rating was typically associated with characteristics of the rated performances, but also with other, more general rater-related or situational factors. Reasons for uncertainty external to the rating session itself were also identified, such as those related to the raters themselves. An analysis of the double-rated performances shows that although similar performance-related reasons seemed to cause uncertainty for different raters, their uncertainty was largely associated with different test-takers’ performances. While uncertainty can be seen as a natural part of holistic rating in high-stakes tests, the study shows that even though uncertainty is not associated with the quality of ratings, we should continually seek ways to address it in language testing, for example by developing rating scales and rater training. This may make raters’ work easier and less burdensome.
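For readers less familiar with the method, many-facet Rasch analysis places test-takers and raters on a common logit scale, which is what allows rater severity/leniency to be estimated separately from test-taker ability. A minimal sketch of the model, assuming the standard rating-scale formulation implemented in programs such as FACETS (the study’s exact specification is not given in the abstract):

$$\log\!\left(\frac{P_{njk}}{P_{nj(k-1)}}\right) = B_n - C_j - F_k$$

where $P_{njk}$ is the probability that test-taker $n$ receives score $k$ rather than $k-1$ from rater $j$, $B_n$ is the test-taker’s ability, $C_j$ the rater’s severity, and $F_k$ the threshold for category $k$. Under this formulation, rater consistency is typically evaluated with fit statistics (e.g., infit mean square), which is one way the abstract’s “rating quality (consistency or severity/leniency)” can be quantified.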
Journal overview
Assessing Writing is a refereed international journal providing a forum for ideas, research, and practice on the assessment of written language. Assessing Writing publishes articles, book reviews, conference reports, and academic exchanges concerning writing assessment of all kinds, including traditional (direct and standardised) testing of writing, alternative performance assessments (such as portfolios), workplace sampling, and classroom assessment. The journal covers all stages of the writing assessment process, including needs evaluation, test development, assessment creation, implementation, and validation.