George Lam, Yusra Shammoon, Anna Coulson, Felicity Lalloo, Arti Maini, Anjali Amin, Celia Brown, Amir H Sam
Medical Teacher, pp. 878-882. Published 2025-05-01 (Epub 2024-08-26). DOI: https://doi.org/10.1080/0142159X.2024.2382860
Utility of large language models for creating clinical assessment items.
Purpose: To compare student performance, examiner perceptions and cost of GPT-assisted (generative pretrained transformer-assisted) clinical and professional skills assessment (CPSA) items against items created using standard methods.
Methods: We conducted a prospective, controlled, double-blinded comparison of CPSA items developed using GPT-assistance with those created through standard methods. Two sets of six practical cases were developed for a formative assessment sat by final year medical students. One clinical case in each set was created with GPT-assistance. Students were assigned to one of the two sets.
Results: The results of 239 participants were analysed in the study. There was no statistically significant difference in item difficulty or discriminative ability between GPT-assisted and standard items. One hundred percent (n = 15) of respondents to an examiner feedback questionnaire felt GPT-assisted cases were appropriately difficult and realistic. GPT-assistance resulted in significant labour cost savings, with a mean reduction of 57% (880 GBP) in labour cost per case when compared to standard case drafting methods.
Conclusions: GPT-assistance can create CPSA items of comparable quality at significantly lower cost when compared to standard methods. Future studies could evaluate GPT's ability to create CPSA material in other areas of clinical practice, aiming to validate the generalisability of these findings.
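Illustrative note: the abstract compares items on difficulty and discriminative ability but does not say how these were calculated. The sketch below shows one common classical-test-theory approach, given here as an assumption rather than as the authors' method: difficulty as the mean score expressed as a fraction of the maximum, and discrimination as the correlation between the item score and the total on the remaining items. The function names, the maximum score of 10 and the five-student data set are all hypothetical.

```python
from math import sqrt


def item_difficulty(item_scores, max_score):
    """Mean item score as a fraction of the maximum (higher means easier)."""
    return sum(item_scores) / (len(item_scores) * max_score)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def item_discrimination(item_scores, total_scores):
    """Item-rest correlation: item score against the total of the other items."""
    rest = [total - item for item, total in zip(item_scores, total_scores)]
    return pearson(item_scores, rest)


# Hypothetical scores for five students on one case (maximum 10 marks)
# and their totals across all six cases. Not data from the study.
case_scores = [6, 8, 5, 9, 7]
exam_totals = [30, 41, 28, 45, 36]

difficulty = item_difficulty(case_scores, max_score=10)
discrimination = item_discrimination(case_scores, exam_totals)
print(f"difficulty={difficulty:.2f}, discrimination={discrimination:.2f}")
```

Under this approach, a GPT-assisted case and a standard case would each receive a difficulty and a discrimination value, and the two groups of values could then be compared with a suitable significance test.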
About the journal:
Medical Teacher provides accounts of new teaching methods, guidance on structuring courses and assessing achievement, and serves as a forum for communication between medical teachers and those involved in general education. In particular, the journal recognizes the problems teachers have in keeping up-to-date with the developments in educational methods that lead to more effective teaching and learning at a time when the content of the curriculum—from medical procedures to policy changes in health care provision—is also changing. The journal features reports of innovation and research in medical education, case studies, survey articles, practical guidelines, reviews of current literature and book reviews. All articles are peer reviewed.