Effects of a rater training on rating accuracy in a physical examination skills assessment.

GMS Zeitschrift fur Medizinische Ausbildung, 31(4), Doc41. Pub Date: 2014-11-17. eCollection Date: 2014-01-01. DOI: 10.3205/zma000933
Gunther Weitz, Christian Vinzentius, Christoph Twesten, Hendrik Lehnert, Hendrik Bonnemeier, Inke R König
Citations: 11

Abstract

Background: The accuracy and reproducibility of medical skills assessment are generally low. Rater training has little or no effect. Our knowledge in this field, however, relies on studies involving video ratings of overall clinical performances. We hypothesised that rater training focussing on the frame of reference could improve accuracy in grading the curricular assessment of a highly standardised head-to-toe physical examination.

Methods: Twenty-one raters assessed the performance of 242 third-year medical students. Eleven raters had been randomly assigned to undergo brief frame-of-reference training a few days before the assessment. A total of 218 encounters were successfully recorded on video and re-assessed independently by three additional observers. Accuracy was defined as the concordance between the rater's grade and the median of the observers' grades. After the assessment, both students and raters filled in a questionnaire about their views on the assessment.
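
The accuracy definition above can be illustrated with a minimal sketch. It assumes integer grades and treats concordance as exact agreement between the rater's grade and the median of the three observers' grades; the abstract does not state the grading scale or the exact concordance criterion, so both are assumptions. The function names and the toy data are hypothetical and are not taken from the study.

```python
from statistics import median


def reference_grade(observer_grades):
    """Median of the observers' grades for one recorded encounter."""
    return median(observer_grades)


def accuracy(encounters):
    """Share of encounters in which the rater's grade equals the observers' median.

    Assumption: concordance means exact agreement; the study may have used
    a different criterion.
    """
    hits = sum(
        1 for rater_grade, observer_grades in encounters
        if rater_grade == reference_grade(observer_grades)
    )
    return hits / len(encounters)


# Hypothetical toy data: (rater's grade, grades from three observers).
encounters = [
    (1, (1, 2, 1)),  # observers' median 1, concordant
    (2, (3, 3, 2)),  # observers' median 3, not concordant
    (3, (2, 3, 4)),  # observers' median 3, concordant
    (2, (1, 1, 2)),  # observers' median 1, not concordant
]

print(accuracy(encounters))  # 0.5 for this toy data
```

With an odd number of observers, the median is always one of the grades actually awarded, so no interpolation between grades is needed.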

Results: Rater training did not have a measurable influence on accuracy. However, trained raters rated significantly more stringently than untrained raters, and their overall stringency was closer to that of the observers. The questionnaire indicated a higher awareness of the halo effect among the trained raters. Although the students' self-assessment mirrored the raters' assessment in both groups, the students assessed by trained raters were more dissatisfied with their grades.

Conclusions: While training had some marginal effects, it failed to have an impact on individual accuracy. These results in real-life encounters are consistent with previous studies on rater training using video assessments of clinical performances. The high degree of standardisation in this study did not suffice to harmonise the trained raters' grading. The data support the notion that the process of appraising medical performance is highly individual. Frame-of-reference training as applied here does not effectively adjust the physicians' judgement of medical students in real-life assessments.

