Can Large Language Models Replicate Systematic Review Outcome Classifications in Medical Education? A Pilot Study Using Kirkpatrick Levels.

Impact Factor: 1.8 | Q2, Education, Scientific Disciplines
Medical Science Educator | Pub Date: 2026-01-16 | eCollection Date: 2026-02-01 | DOI: 10.1007/s40670-026-02639-1
Giuliano Romano, Emilio Romano, Michelle Rau
Medical Science Educator, vol. 36, no. 1, pp. 11-15. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13043860/pdf/
Citations: 0

Abstract

Systematic reviews in medical education often classify outcomes using the Kirkpatrick framework, but manual coding is time-consuming and subjective. We conducted a proof-of-concept study testing ChatGPT (GPT-5, August 2025 release) on 32 full-text articles from a published systematic review of sepsis education. Agreement with human-coded outcomes was modest: 50% agreement, unweighted κ = 0.170 (95% CI 0.000-0.458), weighted κ = 0.351 (95% CI 0.074-0.629). Most disagreements were between adjacent levels.

Supplementary information: The online version contains supplementary material available at 10.1007/s40670-026-02639-1.
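The abstract reports percent agreement plus unweighted and weighted Cohen's κ between the human and model codings. A minimal sketch of how such rater-agreement statistics are computed on ordinal Kirkpatrick levels is below; the labels are illustrative (not the study's data), and linear disagreement weighting is an assumption, since the abstract does not state which weighting scheme was used.

```python
from collections import Counter

def kappa(a, b, categories, weighted=False):
    """Cohen's kappa for two raters over the same items.

    With weighted=True, uses linear disagreement weights |i - j| / (K - 1),
    so near-misses between adjacent ordinal levels are penalized less.
    """
    n = len(a)
    K = len(categories)
    idx = {c: i for i, c in enumerate(categories)}

    # Observed cell proportions of the K x K confusion matrix.
    obs = [[0.0] * K for _ in range(K)]
    for x, y in zip(a, b):
        obs[idx[x]][idx[y]] += 1 / n

    # Chance-expected cell proportions from each rater's marginals.
    pa, pb = Counter(a), Counter(b)
    exp = [[pa[ci] / n * pb[cj] / n for cj in categories] for ci in categories]

    # Disagreement weights: 0 on the diagonal; |i - j| scaled if weighted,
    # else a flat penalty of 1 (which reduces to unweighted kappa).
    w = [[abs(i - j) / (K - 1) if weighted else (0 if i == j else 1)
          for j in range(K)] for i in range(K)]

    d_obs = sum(w[i][j] * obs[i][j] for i in range(K) for j in range(K))
    d_exp = sum(w[i][j] * exp[i][j] for i in range(K) for j in range(K))
    return 1 - d_obs / d_exp

# Hypothetical Kirkpatrick-level codings (levels 1-4) for eight articles.
human = [1, 2, 2, 3, 4, 1, 2, 3]
model = [1, 2, 3, 3, 4, 2, 2, 2]
print(kappa(human, model, [1, 2, 3, 4]))                 # unweighted
print(kappa(human, model, [1, 2, 3, 4], weighted=True))  # linear-weighted
```

As in the study, the weighted value exceeds the unweighted one when most disagreements fall between adjacent levels, because linear weights discount near-misses.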

Source journal: Medical Science Educator (Social Sciences - Education)
CiteScore: 2.90
Self-citation rate: 11.80%
Articles published: 202
Journal description: Medical Science Educator is the successor of the journal JIAMSE. It is the peer-reviewed publication of the International Association of Medical Science Educators (IAMSE). The Journal offers all who teach in healthcare the most current information to succeed in their task by publishing scholarly activities, opinions, and resources in medical science education. Published articles focus on teaching the sciences fundamental to modern medicine and health, and include basic science education, clinical teaching, and the use of modern education technologies. The Journal provides the readership a better understanding of teaching and learning techniques in order to advance medical science education.