Introducing AI as members of script concordance test expert reference panel: A comparative analysis.

IF 3.3 · CAS Tier 2 (Education) · JCR Q1, Education, Scientific Disciplines
Moataz A Sallam, Enjy Abouzeid
{"title":"介绍人工智能作为文字一致性测试专家参考小组的成员:比较分析。","authors":"Moataz A Sallam, Enjy Abouzeid","doi":"10.1080/0142159X.2025.2473620","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>The Script Concordance Test (SCT) is increasingly used in professional development to assess clinical reasoning, with linear progression in SCT performance observed as clinical experience increases. One challenge in implementing SCT is the potential burnout of expert reference panel (ERP) members. To address this, we introduced ChatGPT as panel members. The aim was to enhance the efficiency of SCT creation while maintaining educational content quality and to explore the effectiveness of different models as reference panels.</p><p><strong>Methodology: </strong>A quasi-experimental comparative design was employed, involving all undergraduate medical students and faculty members enrolled in the Ophthalmology clerkship. Two groups involved Traditional ERP which consisted of 15 experts, diversified in clinical experience: 5 senior residents, 5 lecturers, and 5 professors and AI-Generated ERP which is a panel generated using ChatGPT and o1 preview, designed to mirror diverse clinical opinions based on varying experience levels.</p><p><strong>Results: </strong>Experts consistently achieved the highest mean scores across most vignettes, with ChatGPT-4 and o1 scores generally slightly lower. Notably, the o1 mean scores were closer to those of experts compared to ChatGPT-4. Significant differences were observed between ChatGPT-4 and o1 scores in certain vignettes. These values indicate a strong level of consistency, suggesting that both experts and AI models provided highly reliable ratings.</p><p><strong>Conclusion: </strong>These findings suggest that while AI models cannot replace human experts, they can be effectively used to train students, enhance reasoning skills, and help narrow the gap between student and expert performance.</p>","PeriodicalId":18643,"journal":{"name":"Medical Teacher","volume":" ","pages":"1-8"},"PeriodicalIF":3.3000,"publicationDate":"2025-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Introducing AI as members of script concordance test expert reference panel: A comparative analysis.\",\"authors\":\"Moataz A Sallam, Enjy Abouzeid\",\"doi\":\"10.1080/0142159X.2025.2473620\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Background: </strong>The Script Concordance Test (SCT) is increasingly used in professional development to assess clinical reasoning, with linear progression in SCT performance observed as clinical experience increases. One challenge in implementing SCT is the potential burnout of expert reference panel (ERP) members. To address this, we introduced ChatGPT as panel members. The aim was to enhance the efficiency of SCT creation while maintaining educational content quality and to explore the effectiveness of different models as reference panels.</p><p><strong>Methodology: </strong>A quasi-experimental comparative design was employed, involving all undergraduate medical students and faculty members enrolled in the Ophthalmology clerkship. 
Two groups involved Traditional ERP which consisted of 15 experts, diversified in clinical experience: 5 senior residents, 5 lecturers, and 5 professors and AI-Generated ERP which is a panel generated using ChatGPT and o1 preview, designed to mirror diverse clinical opinions based on varying experience levels.</p><p><strong>Results: </strong>Experts consistently achieved the highest mean scores across most vignettes, with ChatGPT-4 and o1 scores generally slightly lower. Notably, the o1 mean scores were closer to those of experts compared to ChatGPT-4. Significant differences were observed between ChatGPT-4 and o1 scores in certain vignettes. These values indicate a strong level of consistency, suggesting that both experts and AI models provided highly reliable ratings.</p><p><strong>Conclusion: </strong>These findings suggest that while AI models cannot replace human experts, they can be effectively used to train students, enhance reasoning skills, and help narrow the gap between student and expert performance.</p>\",\"PeriodicalId\":18643,\"journal\":{\"name\":\"Medical Teacher\",\"volume\":\" \",\"pages\":\"1-8\"},\"PeriodicalIF\":3.3000,\"publicationDate\":\"2025-03-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Medical Teacher\",\"FirstCategoryId\":\"95\",\"ListUrlMain\":\"https://doi.org/10.1080/0142159X.2025.2473620\",\"RegionNum\":2,\"RegionCategory\":\"教育学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"EDUCATION, SCIENTIFIC DISCIPLINES\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Medical Teacher","FirstCategoryId":"95","ListUrlMain":"https://doi.org/10.1080/0142159X.2025.2473620","RegionNum":2,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"EDUCATION, SCIENTIFIC DISCIPLINES","Score":null,"Total":0}
Citations: 0

Abstract


Background: The Script Concordance Test (SCT) is increasingly used in professional development to assess clinical reasoning, with linear progression in SCT performance observed as clinical experience increases. One challenge in implementing SCT is the potential burnout of expert reference panel (ERP) members. To address this, we introduced ChatGPT models as panel members. The aim was to enhance the efficiency of SCT creation while maintaining educational content quality and to explore the effectiveness of different models as reference panels.

Methodology: A quasi-experimental comparative design was employed, involving all undergraduate medical students and faculty members enrolled in the Ophthalmology clerkship. Two groups were compared: a traditional ERP consisting of 15 experts with diverse clinical experience (5 senior residents, 5 lecturers, and 5 professors), and an AI-generated ERP, a panel produced with ChatGPT and o1-preview designed to mirror diverse clinical opinions across varying experience levels.
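The abstract does not spell out how panel responses are turned into test scores, but SCTs are conventionally scored with the aggregate (partial-credit) method, in which the panel's modal answer earns full credit and every other answer earns credit proportional to how many panelists chose it. The minimal Python sketch below illustrates that standard method only; the panel data and variable names are illustrative assumptions, not taken from the study.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical example of SCT aggregate (partial-credit) scoring.
# Panelists answer each item on a 5-point Likert scale (-2 .. +2).

def build_answer_key(panel_responses: List[int]) -> Dict[int, float]:
    """Convert one item's panel responses into a partial-credit key:
    modal answer = 1.0; other answers = their count / modal count."""
    counts = Counter(panel_responses)
    modal_count = max(counts.values())
    return {option: n / modal_count for option, n in counts.items()}

def score_response(response: int, key: Dict[int, float]) -> float:
    """An answer that no panelist gave earns 0 credit."""
    return key.get(response, 0.0)

# Illustrative panels (not the study's data): 15 "experts" vs. 5 AI runs.
expert_panel = [1, 1, 1, 2, 0, 1, 2, 1, 0, 1, 1, 2, 1, 0, 1]
ai_panel = [1, 1, 2, 1, 1]

expert_key = build_answer_key(expert_panel)
ai_key = build_answer_key(ai_panel)

print(score_response(1, expert_key))   # 1.0   (modal expert answer)
print(score_response(2, expert_key))   # 0.33  (partial credit)
print(score_response(-2, expert_key))  # 0.0   (no panelist chose it)
```

Under this scheme, swapping the expert panel for an AI-generated one changes only which key is built; the student-facing scoring rule stays the same.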

Results: Experts consistently achieved the highest mean scores across most vignettes, with ChatGPT-4 and o1 scores generally slightly lower. Notably, the o1 mean scores were closer to those of the experts than the ChatGPT-4 scores were. Significant differences were observed between ChatGPT-4 and o1 scores in certain vignettes. Reliability estimates indicated a strong level of consistency, suggesting that both experts and AI models provided highly reliable ratings.
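To illustrate the kind of comparison summarized above (per-vignette mean scores plus a consistency estimate), the sketch below computes group means and Cronbach's alpha over a vignette-by-rater matrix. The data are simulated purely for demonstration; the study's actual ratings and choice of reliability statistic may differ.

```python
import numpy as np

def cronbach_alpha(ratings: np.ndarray) -> float:
    """Internal consistency with raters treated as "items".
    ratings: shape (n_vignettes, n_raters)."""
    k = ratings.shape[1]
    rater_vars = ratings.var(axis=0, ddof=1)      # variance per rater
    total_var = ratings.sum(axis=1).var(ddof=1)   # variance of vignette totals
    return (k / (k - 1)) * (1 - rater_vars.sum() / total_var)

# Simulated ratings: 10 vignettes, 15 expert raters, 5 AI (o1) runs.
rng = np.random.default_rng(0)
vignette_truth = rng.normal(3.5, 0.5, size=10)
experts = vignette_truth[:, None] + rng.normal(0.0, 0.20, (10, 15))
ai_o1 = vignette_truth[:, None] + rng.normal(-0.1, 0.25, (10, 5))

print("Expert vignette means:", experts.mean(axis=1).round(2))
print("o1 vignette means:    ", ai_o1.mean(axis=1).round(2))
print("Expert alpha:", round(cronbach_alpha(experts), 2))
print("o1 alpha:    ", round(cronbach_alpha(ai_o1), 2))
```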

Conclusion: These findings suggest that while AI models cannot replace human experts, they can be effectively used to train students, enhance reasoning skills, and help narrow the gap between student and expert performance.

Source journal: Medical Teacher (Medicine - Health Care Sciences & Services)
CiteScore: 7.80 · Self-citation rate: 8.50% · Articles per year: 396 · Time to review: 3-6 weeks
About the journal: Medical Teacher provides accounts of new teaching methods, guidance on structuring courses and assessing achievement, and serves as a forum for communication between medical teachers and those involved in general education. In particular, the journal recognizes the problems teachers have in keeping up-to-date with the developments in educational methods that lead to more effective teaching and learning at a time when the content of the curriculum, from medical procedures to policy changes in health care provision, is also changing. The journal features reports of innovation and research in medical education, case studies, survey articles, practical guidelines, reviews of current literature and book reviews. All articles are peer reviewed.