Validity and reliability of Brier scoring for assessment of probabilistic diagnostic reasoning.

IF 2.2 Q2 MEDICINE, GENERAL & INTERNAL
Diagnosis Pub Date : 2024-10-16 DOI:10.1515/dx-2023-0109
Nathan Stehouwer, Anastasia Rowland-Seymour, Larry Gruppen, Jeffrey M Albert, Kelli Qua
Citations: 0

Abstract

Objectives: Educators need tools for the assessment of clinical reasoning that reflect the ambiguity of real-world practice and measure learners' ability to determine diagnostic likelihood. In this study, the authors describe the use of the Brier score to assess and provide feedback on the quality of probabilistic diagnostic reasoning.

Methods: The authors describe a novel format called Diagnostic Forecasting (DxF), in which participants read a brief clinical case and assign a probability to each item on a differential diagnosis, order tests and select a final diagnosis. DxF was piloted in a cohort of senior medical students. DxF evaluated students' answers with Brier scores, which compare probabilistic forecasts with case outcomes. The validity of Brier scores in DxF was assessed by comparison to subsequent decision-making in the game environment of DxF, as well as external criteria including medical knowledge tests and performance on clinical rotations.
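To make the scoring idea concrete, the following is a minimal sketch of a multi-category Brier score for a single case. It assumes the original Brier formulation (the sum of squared differences between each forecast probability and the 0/1 outcome, ranging from 0 for a perfect forecast to 2 for a maximally wrong one). The abstract does not state which variant or scaling DxF uses, so the function name `brier_score`, the input format, and the example diagnoses are illustrative assumptions rather than the authors' implementation.

```python
import math


def brier_score(forecast: dict[str, float], outcome: str) -> float:
    """Multi-category Brier score for one case (hypothetical helper).

    forecast: maps each diagnosis on the differential to its assigned probability.
    outcome:  the diagnosis that turned out to be correct.
    Returns the sum over diagnoses of (probability - indicator)^2, where the
    indicator is 1 for the true diagnosis and 0 otherwise. Lower is better.
    """
    if outcome not in forecast:
        raise ValueError("the true diagnosis must appear on the differential")
    if any(p < 0.0 or p > 1.0 for p in forecast.values()):
        raise ValueError("each probability must lie between 0 and 1")
    if not math.isclose(sum(forecast.values()), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    return sum(
        (p - (1.0 if dx == outcome else 0.0)) ** 2
        for dx, p in forecast.items()
    )


# Illustrative differential; the diagnoses and probabilities are made up.
hedged = {"pneumonia": 0.6, "pulmonary embolism": 0.3, "heart failure": 0.1}
confident = {"pneumonia": 0.9, "pulmonary embolism": 0.05, "heart failure": 0.05}

print(brier_score(hedged, "pneumonia"))
print(brier_score(confident, "pneumonia"))
print(brier_score(hedged, "pulmonary embolism"))
```

Under this convention a confident, correct forecast scores close to 0, while putting most of the probability on a diagnosis that proves wrong is penalized heavily, so the score rewards both accuracy and appropriately expressed uncertainty.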

Results: Brier scores were statistically significantly correlated with diagnostic accuracy (95 % CI -4.4 to -0.44) and with mean scores on the National Board of Medical Examiners (NBME) shelf exams (95 % CI -474.6 to -225.1). Brier scores did not correlate with clerkship grades or performance on a structured clinical skills exam. Reliability as measured by within-student correlation was low.

Conclusions: Brier scoring showed evidence for validity as a measurement of medical knowledge and predictor of clinical decision-making. Further work must evaluate the ability of Brier scores to predict clinical and workplace-based outcomes, and develop reliable approaches to measuring probabilistic reasoning.

Source journal: Diagnosis (MEDICINE, GENERAL & INTERNAL)
CiteScore: 7.20
Self-citation rate: 5.70%
Articles published: 41
Journal description: Diagnosis focuses on how diagnosis can be advanced, how it is taught, and how and why it can fail, leading to diagnostic errors. The journal welcomes both fundamental and applied works, improvement initiatives, opinions, and debates to encourage new thinking on improving this critical aspect of healthcare quality.
Topics:
- Factors that promote diagnostic quality and safety
- Clinical reasoning
- Diagnostic errors in medicine
- The factors that contribute to diagnostic error: human factors, cognitive issues, and system-related breakdowns
- Improving the value of diagnosis: eliminating waste and unnecessary testing
- How culture and removing blame promote awareness of diagnostic errors
- Training and education related to clinical reasoning and diagnostic skills
- Advances in laboratory testing and imaging that improve diagnostic capability
- Local, national and international initiatives to reduce diagnostic error