Use of AI (GPT-4)-generated multiple-choice questions for the examination of surgical subspecialty residents: Report of feasibility and psychometric analysis.

IF 1.9 | CAS Tier 4 (Medicine) | Q3 | UROLOGY & NEPHROLOGY
Jin Kyu Kim, Michael Chua, Armando Lorenzo, Mandy Rickard, Laura Andreacchi, Michael Kim, Douglas Cheung, Yonah Krakowsky, Jason Y Lee
{"title":"使用人工智能(GPT-4)生成的选择题对外科亚专科住院医师进行考试:可行性报告和心理测量分析。","authors":"Jin Kyu Kim, Michael Chua, Armando Lorenzo, Mandy Rickard, Laura Andreacchi, Michael Kim, Douglas Cheung, Yonah Krakowsky, Jason Y Lee","doi":"10.5489/cuaj.9020","DOIUrl":null,"url":null,"abstract":"<p><strong>Introduction: </strong>Multiple-choice questions (MCQs) are essential in medical education and widely used by licensing bodies. They are traditionally created with intensive human effort to ensure validity. Recent advances in AI, particularly large language models (LLMs), offer the potential to streamline this process. This study aimed to develop and test a GPT-4 model with customized instructions for generating MCQs to assess urology residents.</p><p><strong>Methods: </strong>A GPT-4 model was embedded using guidelines from medical licensing bodies and reference materials specific to urology. This model was tasked with generating MCQs designed to mimic the format and content of the 2023 urology examination outlined by the Royal College of Physicians and Surgeons of Canada (RCPSC). Following generation, a selection of MCQs underwent expert review for validity and suitability.</p><p><strong>Results: </strong>From an initial set of 123 generated MCQs, 60 were chosen for inclusion in an exam administered to 15 urology residents at the University of Toronto. The exam results demonstrated a general increasing performance with level of training cohorts, suggesting the MCQs' ability to effectively discriminate knowledge levels among residents. The majority (33/60) of the questions had discriminatory value that appeared acceptable (discriminatory index 0.2-0.4) or excellent (discriminatory index >0.4).</p><p><strong>Conclusions: </strong>This study highlights AI-driven models like GPT-4 as efficient tools to aid with MCQ generation in medical education assessments. By automating MCQ creation while maintaining quality standards, AI can expedite processes. Future research should focus on refining AI applications in education to optimize assessments and enhance medical training and certification outcomes.</p>","PeriodicalId":50613,"journal":{"name":"Cuaj-Canadian Urological Association Journal","volume":" ","pages":""},"PeriodicalIF":1.9000,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Use of AI (GPT-4)-generated multiple-choice questions for the examination of surgical subspecialty residents: Report of feasibility and psychometric analysis.\",\"authors\":\"Jin Kyu Kim, Michael Chua, Armando Lorenzo, Mandy Rickard, Laura Andreacchi, Michael Kim, Douglas Cheung, Yonah Krakowsky, Jason Y Lee\",\"doi\":\"10.5489/cuaj.9020\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Introduction: </strong>Multiple-choice questions (MCQs) are essential in medical education and widely used by licensing bodies. They are traditionally created with intensive human effort to ensure validity. Recent advances in AI, particularly large language models (LLMs), offer the potential to streamline this process. This study aimed to develop and test a GPT-4 model with customized instructions for generating MCQs to assess urology residents.</p><p><strong>Methods: </strong>A GPT-4 model was embedded using guidelines from medical licensing bodies and reference materials specific to urology. 
This model was tasked with generating MCQs designed to mimic the format and content of the 2023 urology examination outlined by the Royal College of Physicians and Surgeons of Canada (RCPSC). Following generation, a selection of MCQs underwent expert review for validity and suitability.</p><p><strong>Results: </strong>From an initial set of 123 generated MCQs, 60 were chosen for inclusion in an exam administered to 15 urology residents at the University of Toronto. The exam results demonstrated a general increasing performance with level of training cohorts, suggesting the MCQs' ability to effectively discriminate knowledge levels among residents. The majority (33/60) of the questions had discriminatory value that appeared acceptable (discriminatory index 0.2-0.4) or excellent (discriminatory index >0.4).</p><p><strong>Conclusions: </strong>This study highlights AI-driven models like GPT-4 as efficient tools to aid with MCQ generation in medical education assessments. By automating MCQ creation while maintaining quality standards, AI can expedite processes. Future research should focus on refining AI applications in education to optimize assessments and enhance medical training and certification outcomes.</p>\",\"PeriodicalId\":50613,\"journal\":{\"name\":\"Cuaj-Canadian Urological Association Journal\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":1.9000,\"publicationDate\":\"2025-02-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Cuaj-Canadian Urological Association Journal\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.5489/cuaj.9020\",\"RegionNum\":4,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"UROLOGY & NEPHROLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Cuaj-Canadian Urological Association Journal","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.5489/cuaj.9020","RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"UROLOGY & NEPHROLOGY","Score":null,"Total":0}
Citations: 0

Abstract


Introduction: Multiple-choice questions (MCQs) are essential in medical education and widely used by licensing bodies. They are traditionally created with intensive human effort to ensure validity. Recent advances in AI, particularly large language models (LLMs), offer the potential to streamline this process. This study aimed to develop and test a GPT-4 model with customized instructions for generating MCQs to assess urology residents.

Methods: A GPT-4 model was customized by embedding guidelines from medical licensing bodies and urology-specific reference materials as custom instructions. The model was tasked with generating MCQs that mimicked the format and content of the 2023 urology examination outlined by the Royal College of Physicians and Surgeons of Canada (RCPSC). Following generation, a selection of MCQs underwent expert review for validity and suitability.
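The published abstract does not include the custom instructions or tooling the authors embedded in their GPT-4 model, so the following is only a minimal sketch of the general pattern: a GPT-4 chat completion seeded with a system prompt that encodes item-writing guidelines. The model name, prompt wording, and the `generate_mcq` helper are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of guideline-driven MCQ generation with the OpenAI Python SDK.
# ASSUMPTIONS: the study's actual custom instructions, reference materials, and
# output format are unpublished; everything below is illustrative only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "You are an examiner writing board-style multiple-choice questions for "
    "urology residents. Follow standard item-writing guidelines: a single "
    "focused clinical stem, five homogeneous options, one unambiguous best "
    "answer, and no 'all/none of the above' options."
)

def generate_mcq(topic: str) -> str:
    """Request one exam-style MCQ on a given urology topic (hypothetical helper)."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"Write one multiple-choice question on: {topic}"},
        ],
        temperature=0.7,
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(generate_mcq("initial management of nephrolithiasis"))
```

In any workflow along these lines, generated items would still pass through the expert review step described above before being administered to residents.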

Results: From an initial set of 123 generated MCQs, 60 were chosen for inclusion in an exam administered to 15 urology residents at the University of Toronto. Exam performance generally increased with level of training, suggesting the MCQs could effectively discriminate knowledge levels among residents. The majority of questions (33/60) showed acceptable (discrimination index 0.2-0.4) or excellent (discrimination index >0.4) discriminatory value.
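The abstract reports discrimination indices and the bands used to grade them, but not the formula. A common choice is the classical upper-lower group index, D = p_upper - p_lower, computed over the top and bottom 27% of examinees by total score; the sketch below assumes that method and applies the abstract's bands (acceptable 0.2-0.4, excellent >0.4). It is an illustration, not the authors' analysis code.

```python
# Classical upper-lower item discrimination index (assumed method; the paper
# does not specify its exact psychometric formula).
from typing import List

def discrimination_index(item_scores: List[int],
                         total_scores: List[float],
                         fraction: float = 0.27) -> float:
    """item_scores[i] = 1 if examinee i answered this item correctly, else 0;
    total_scores[i] = examinee i's total exam score."""
    n = len(item_scores)
    k = max(1, round(n * fraction))                 # size of upper/lower groups
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower, upper = order[:k], order[-k:]            # weakest and strongest examinees
    p_upper = sum(item_scores[i] for i in upper) / k
    p_lower = sum(item_scores[i] for i in lower) / k
    return p_upper - p_lower

def classify(d: float) -> str:
    """Bands as reported in the abstract."""
    if d > 0.4:
        return "excellent"
    if d >= 0.2:
        return "acceptable"
    return "poor"
```

With 15 examinees, k = round(15 * 0.27) = 4, so each comparison group holds only four residents; an index estimated from cohorts this small is noisy, which is worth keeping in mind when reading the 33/60 figure.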

Conclusions: This study highlights AI-driven models such as GPT-4 as efficient tools for MCQ generation in medical education assessments. By automating MCQ creation while maintaining quality standards, AI can expedite exam development. Future research should focus on refining AI applications in education to optimize assessments and improve medical training and certification outcomes.

Source journal
Cuaj-Canadian Urological Association Journal (Medicine - Urology & Nephrology)
CiteScore: 2.80
Self-citation rate: 10.50%
Articles per year: 167
Review time: >12 weeks
Journal description: CUAJ is a peer-reviewed, open-access journal devoted to promoting the highest standard of urological patient care through the publication of timely, relevant, evidence-based research and advocacy information.