Assessing LLM-generated vs. expert-created clinical anatomy MCQs: a student perception-based comparative study in medical education.

IF 3.8 | Medicine Tier 2 | Q1 EDUCATION & EDUCATIONAL RESEARCH
Medical Education Online Pub Date : 2025-12-01 Epub Date: 2025-08-30 DOI:10.1080/10872981.2025.2554678
Maram Elzayyat, Janatul Naeim Mohammad, Sami Zaqout
{"title":"评估法学硕士生成与专家创建的临床解剖学mcq:医学教育中基于学生感知的比较研究。","authors":"Maram Elzayyat, Janatul Naeim Mohammad, Sami Zaqout","doi":"10.1080/10872981.2025.2554678","DOIUrl":null,"url":null,"abstract":"<p><p>Large language models (LLMs) such as ChatGPT and Gemini are increasingly used to generate educational content in medical education, including multiple-choice questions (MCQs), but their effectiveness compared to expert-written questions remains underexplored, particularly in anatomy. We conducted a cross-sectional, mixed-methods study involving Year 2-4 medical students at Qatar University, where participants completed and evaluated three anonymized MCQ sets-authored by ChatGPT, Google-Gemini, and a clinical anatomist-across 17 quality criteria. Descriptive and chi-square analyses were performed, and optional feedback was reviewed thematically. Among 48 participants, most rated the three MCQ sources as equally effective, although ChatGPT was more often preferred for helping students identify and confront their knowledge gaps through challenging distractors and diagnostic insight, while expert-written questions were rated highest for deeper analytical thinking. A significant variation in preferences was observed across sources (χ² (64) = 688.79, <i>p</i> < .001). Qualitative feedback emphasized the need for better difficulty calibration and clearer distractors in some AI-generated items. Overall, LLM-generated anatomy MCQs can closely match expert-authored ones in learner-perceived value and may support deeper engagement, but expert review remains critical to ensure clarity and alignment with curricular goals. A hybrid AI-human workflow may provide a promising path for scalable, high-quality assessment design in medical education.</p>","PeriodicalId":47656,"journal":{"name":"Medical Education Online","volume":"30 1","pages":"2554678"},"PeriodicalIF":3.8000,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12404065/pdf/","citationCount":"0","resultStr":"{\"title\":\"Assessing LLM-generated vs. expert-created clinical anatomy MCQs: a student perception-based comparative study in medical education.\",\"authors\":\"Maram Elzayyat, Janatul Naeim Mohammad, Sami Zaqout\",\"doi\":\"10.1080/10872981.2025.2554678\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Large language models (LLMs) such as ChatGPT and Gemini are increasingly used to generate educational content in medical education, including multiple-choice questions (MCQs), but their effectiveness compared to expert-written questions remains underexplored, particularly in anatomy. We conducted a cross-sectional, mixed-methods study involving Year 2-4 medical students at Qatar University, where participants completed and evaluated three anonymized MCQ sets-authored by ChatGPT, Google-Gemini, and a clinical anatomist-across 17 quality criteria. Descriptive and chi-square analyses were performed, and optional feedback was reviewed thematically. Among 48 participants, most rated the three MCQ sources as equally effective, although ChatGPT was more often preferred for helping students identify and confront their knowledge gaps through challenging distractors and diagnostic insight, while expert-written questions were rated highest for deeper analytical thinking. A significant variation in preferences was observed across sources (χ² (64) = 688.79, <i>p</i> < .001). 
Qualitative feedback emphasized the need for better difficulty calibration and clearer distractors in some AI-generated items. Overall, LLM-generated anatomy MCQs can closely match expert-authored ones in learner-perceived value and may support deeper engagement, but expert review remains critical to ensure clarity and alignment with curricular goals. A hybrid AI-human workflow may provide a promising path for scalable, high-quality assessment design in medical education.</p>\",\"PeriodicalId\":47656,\"journal\":{\"name\":\"Medical Education Online\",\"volume\":\"30 1\",\"pages\":\"2554678\"},\"PeriodicalIF\":3.8000,\"publicationDate\":\"2025-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12404065/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Medical Education Online\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.1080/10872981.2025.2554678\",\"RegionNum\":2,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2025/8/30 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q1\",\"JCRName\":\"EDUCATION & EDUCATIONAL RESEARCH\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Medical Education Online","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1080/10872981.2025.2554678","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/8/30 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"EDUCATION & EDUCATIONAL RESEARCH","Score":null,"Total":0}
Citations: 0

Abstract


Large language models (LLMs) such as ChatGPT and Gemini are increasingly used to generate educational content in medical education, including multiple-choice questions (MCQs), but their effectiveness compared to expert-written questions remains underexplored, particularly in anatomy. We conducted a cross-sectional, mixed-methods study involving Year 2-4 medical students at Qatar University, where participants completed and evaluated three anonymized MCQ sets (authored by ChatGPT, Google-Gemini, and a clinical anatomist) across 17 quality criteria. Descriptive and chi-square analyses were performed, and optional feedback was reviewed thematically. Among 48 participants, most rated the three MCQ sources as equally effective, although ChatGPT was more often preferred for helping students identify and confront their knowledge gaps through challenging distractors and diagnostic insight, while expert-written questions were rated highest for deeper analytical thinking. A significant variation in preferences was observed across sources (χ²(64) = 688.79, p < .001). Qualitative feedback emphasized the need for better difficulty calibration and clearer distractors in some AI-generated items. Overall, LLM-generated anatomy MCQs can closely match expert-authored ones in learner-perceived value and may support deeper engagement, but expert review remains critical to ensure clarity and alignment with curricular goals. A hybrid AI-human workflow may provide a promising path for scalable, high-quality assessment design in medical education.
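The reported χ²(64) statistic implies that student preference ratings were tabulated as a contingency table of counts (rating categories by MCQ source) and tested for independence. As a minimal illustrative sketch only, with made-up counts rather than the study's data, such a chi-square test could be run in Python with scipy.stats.chi2_contingency:

```python
# Illustrative sketch: chi-square test of independence on HYPOTHETICAL counts.
# Rows = example rating criteria, columns = MCQ source (ChatGPT, Gemini, expert).
# These numbers are invented for demonstration; they are not the study's data.
import numpy as np
from scipy.stats import chi2_contingency

observed = np.array([
    # ChatGPT  Gemini  Expert
    [18,       12,     18],   # e.g. "helps identify knowledge gaps"
    [14,       15,     19],   # e.g. "distractor clarity"
    [10,       13,     25],   # e.g. "deeper analytical thinking"
])

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi2({dof}) = {chi2:.2f}, p = {p:.4f}")
```

A larger table (for example, 17 criteria crossed with several response options) would yield higher degrees of freedom, consistent with the 64 degrees of freedom reported in the abstract.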

Source journal
Medical Education Online (EDUCATION & EDUCATIONAL RESEARCH)
CiteScore: 6.00 | Self-citation rate: 2.20% | Articles per year: 97 | Review turnaround: 8 weeks
Journal description: Medical Education Online is an open access journal of health care education, publishing peer-reviewed research, perspectives, reviews, and early documentation of new ideas and trends. Medical Education Online aims to disseminate information on the education and training of physicians and other health care professionals. Manuscripts may address any aspect of health care education and training, including, but not limited to:
- Basic science education
- Clinical science education
- Residency education
- Learning theory
- Problem-based learning (PBL)
- Curriculum development
- Research design and statistics
- Measurement and evaluation
- Faculty development
- Informatics/web