Tian Qingquan, Ren Feng, Zou Bin, Zhou Jingyu, Liu Ganglei, Zheng Yanwen, Zhang Zequn, Wang Qiyuan, Wang Shalong
BMC Medical Education, 25(1):845. Published 2025-06-05. DOI: 10.1186/s12909-025-07414-1. Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12143027/pdf/
Iteratively refined ChatGPT outperforms clinical mentors in generating high-quality interprofessional education clinical scenarios: a comparative study.
Background: Interprofessional education (IPE) is essential for promoting teamwork among healthcare professionals. However, its implementation is often hindered by the limited availability of interprofessional faculty and scheduling challenges in creating high-quality IPE scenarios. While AI tools like ChatGPT are increasingly being explored for this purpose, they have yet to demonstrate the ability to generate high-quality IPE scenarios, which remains a significant challenge. This study examines the effectiveness of GPT-4o, an advanced version of ChatGPT enhanced by novel methodologies, in overcoming these obstacles.
Methods: This comparative study assessed clinical scenarios generated by GPT-4o using two strategies, a standard prompt (single-step scenario generation without iterative feedback) and iterative refinement (a multi-step, feedback-driven process), against those crafted by clinical mentors. The iterative refinement method, inspired by actual clinical scenario development, employs a cyclical process of evaluation and refinement, closely mimicking discussions among professionals. Scenarios were evaluated for time efficiency and quality using the Interprofessional Quality Score (IQS), defined as the mean score assigned by multidisciplinary evaluators across five interprofessional criteria: clinical authenticity, team collaboration, educational alignment, appropriate challenge, and student engagement.
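The two pieces of the Methods that have a concrete computational shape are the IQS (a plain mean over all evaluator scores across the five criteria) and the cyclical evaluate-and-refine loop. The sketch below illustrates both under stated assumptions: the paper does not publish code, so the `generate`, `evaluate`, and `refine` callables, the rating scale, and the stopping rule (`max_rounds`, `target`) are all hypothetical stand-ins, not the authors' implementation.

```python
from statistics import mean

# The five interprofessional criteria named in the Methods section.
CRITERIA = [
    "clinical authenticity",
    "team collaboration",
    "educational alignment",
    "appropriate challenge",
    "student engagement",
]

def iqs(ratings):
    """Interprofessional Quality Score: the mean of all scores assigned by
    multidisciplinary evaluators across the five criteria.

    `ratings` maps each criterion to a list of per-evaluator scores.
    """
    return mean(s for criterion in CRITERIA for s in ratings[criterion])

def iterative_refinement(generate, evaluate, refine, max_rounds=5, target=4.5):
    """Cyclical generate -> evaluate -> refine loop, in contrast to the
    single-step 'standard prompt' strategy.

    All three callables are hypothetical stand-ins for the prompting steps
    described in the paper; `max_rounds` and `target` are assumed values.
    """
    scenario = generate()                     # initial draft scenario
    for _ in range(max_rounds):
        ratings = evaluate(scenario)          # per-criterion evaluator scores
        if iqs(ratings) >= target:            # quality threshold met: stop
            break
        scenario = refine(scenario, ratings)  # feed scores back into the prompt
    return scenario
```

For example, if three evaluators each rate a scenario 4, 5, and 4 on every criterion, the IQS is the mean of those fifteen scores, about 4.33; a draft below the threshold would be sent back through `refine` with its ratings attached.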
Results: Scenarios developed using the iterative refinement strategy were completed significantly faster than those developed by clinical mentors and achieved an equivalent or higher IQS. Notably, these scenarios matched or exceeded the quality of those created by humans, particularly in areas such as appropriate challenge and student engagement. Conversely, scenarios generated via the standard prompt method exhibited lower accuracy and various other deficiencies. Blinded attribution assessments by students further demonstrated that scenarios developed through iterative refinement were often indistinguishable from those created by human mentors.
Conclusions: Employing GPT-4o with iterative refinement and role-playing strategies produces clinical scenarios that, in some areas, exceed those developed by clinical mentors. This approach reduces the need for extensive faculty involvement, highlighting AI's potential to closely align with established educational frameworks and substantially enhance IPE, particularly in resource-constrained settings.
Journal overview:
BMC Medical Education is an open access journal publishing original peer-reviewed research articles in relation to the training of healthcare professionals, including undergraduate, postgraduate, and continuing education. The journal has a special focus on curriculum development, evaluations of performance, assessment of training needs and evidence-based medicine.