{"title":"Evaluating the application of ChatGPT in China's residency training education: An exploratory study.","authors":"Luxiang Shang, Rui Li, Mingyue Xue, Qilong Guo, Yinglong Hou","doi":"10.1080/0142159X.2024.2377808","DOIUrl":null,"url":null,"abstract":"<p><strong>Objective: </strong>The purpose of this study was to assess the utility of information generated by ChatGPT for residency education in China.</p><p><strong>Methods: </strong>We designed a three-step survey to evaluate the performance of ChatGPT in China's residency training education including residency final examination questions, patient cases, and resident satisfaction scores. First, 204 questions from the residency final exam were input into ChatGPT's interface to obtain the percentage of correct answers. Next, ChatGPT was asked to generate 20 clinical cases, which were subsequently evaluated by three instructors using a pre-designed Likert scale with 5 points. The quality of the cases was assessed based on criteria including clarity, relevance, logicality, credibility, and comprehensiveness. Finally, interaction sessions between 31 third-year residents and ChatGPT were conducted. Residents' perceptions of ChatGPT's feedback were assessed using a Likert scale, focusing on aspects such as ease of use, accuracy and completeness of responses, and its effectiveness in enhancing understanding of medical knowledge.</p><p><strong>Results: </strong>Our results showed ChatGPT-3.5 correctly answered 45.1% of exam questions. In the virtual patient cases, ChatGPT received mean ratings of 4.57 ± 0.50, 4.68 ± 0.47, 4.77 ± 0.46, 4.60 ± 0.53, and 3.95 ± 0.59 points for clarity, relevance, logicality, credibility, and comprehensiveness from clinical instructors, respectively. Among training residents, ChatGPT scored 4.48 ± 0.70, 4.00 ± 0.82 and 4.61 ± 0.50 points for ease of use, accuracy and completeness, and usefulness, respectively.</p><p><strong>Conclusion: </strong>Our findings demonstrate ChatGPT's immense potential for personalized Chinese medical education.</p>","PeriodicalId":18643,"journal":{"name":"Medical Teacher","volume":" ","pages":"858-864"},"PeriodicalIF":3.3000,"publicationDate":"2025-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Medical Teacher","FirstCategoryId":"95","ListUrlMain":"https://doi.org/10.1080/0142159X.2024.2377808","RegionNum":2,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/7/12 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"EDUCATION, SCIENTIFIC DISCIPLINES","Score":null,"Total":0}
Citations: 0
Abstract
Objective: The purpose of this study was to assess the utility of information generated by ChatGPT for residency education in China.
Methods: We designed a three-step survey to evaluate the performance of ChatGPT in China's residency training education, covering residency final examination questions, patient cases, and resident satisfaction scores. First, 204 questions from the residency final exam were input into ChatGPT's interface to obtain the percentage of correct answers. Next, ChatGPT was asked to generate 20 clinical cases, which were subsequently evaluated by three instructors using a pre-designed 5-point Likert scale. The quality of the cases was assessed on criteria including clarity, relevance, logicality, credibility, and comprehensiveness. Finally, interaction sessions between 31 third-year residents and ChatGPT were conducted. Residents' perceptions of ChatGPT's feedback were assessed using a Likert scale, focusing on aspects such as ease of use, accuracy and completeness of responses, and effectiveness in enhancing understanding of medical knowledge.
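A minimal Python sketch of how the first step (batch-grading the exam questions) could be reproduced programmatically. This is a hypothetical illustration, not the authors' procedure: the study reports entering questions into ChatGPT's interface directly, and the "questions.json" file layout, the "gpt-3.5-turbo" model name, and the single-letter answer-matching rule below are all assumptions.

    # Hypothetical sketch: batch-grade multiple-choice exam questions with a ChatGPT-style API.
    # questions.json is assumed to hold entries like
    # {"question": "...", "options": {"A": "...", "B": "..."}, "answer": "B"}.
    import json
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    with open("questions.json", encoding="utf-8") as f:
        questions = json.load(f)

    correct = 0
    for item in questions:
        prompt = (
            f"{item['question']}\n"
            + "\n".join(f"{key}. {text}" for key, text in item["options"].items())
            + "\nAnswer with the single letter of the best option."
        )
        reply = client.chat.completions.create(
            model="gpt-3.5-turbo",  # assumed; the abstract reports ChatGPT-3.5
            messages=[{"role": "user", "content": prompt}],
        )
        predicted = reply.choices[0].message.content.strip()[:1].upper()
        correct += predicted == item["answer"].upper()

    print(f"Accuracy: {correct / len(questions):.1%}")  # the study reports 45.1%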
Results: Our results showed that ChatGPT-3.5 correctly answered 45.1% of the exam questions. For the virtual patient cases, ChatGPT received mean ratings from the clinical instructors of 4.57 ± 0.50, 4.68 ± 0.47, 4.77 ± 0.46, 4.60 ± 0.53, and 3.95 ± 0.59 points for clarity, relevance, logicality, credibility, and comprehensiveness, respectively. Among the training residents, ChatGPT scored 4.48 ± 0.70, 4.00 ± 0.82, and 4.61 ± 0.50 points for ease of use, accuracy and completeness, and usefulness, respectively.
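The ratings above are summarized as mean ± standard deviation on 5-point Likert items. A minimal sketch of that calculation, using made-up placeholder ratings rather than the study's data (and assuming the sample standard deviation, which the abstract does not specify):

    # Minimal sketch: mean ± SD for a set of 5-point Likert ratings (placeholder values, not study data).
    import statistics

    clarity_ratings = [5, 4, 5, 5, 4, 4, 5, 5, 4, 5]  # hypothetical instructor ratings
    mean = statistics.mean(clarity_ratings)
    sd = statistics.stdev(clarity_ratings)  # sample standard deviation
    print(f"Clarity: {mean:.2f} ± {sd:.2f}")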
Conclusion: Our findings demonstrate ChatGPT's considerable potential for personalized medical education in China.
About the journal:
Medical Teacher provides accounts of new teaching methods and guidance on structuring courses and assessing achievement, and serves as a forum for communication between medical teachers and those involved in general education. In particular, the journal recognizes the problems teachers face in keeping up to date with developments in educational methods that lead to more effective teaching and learning at a time when the content of the curriculum—from medical procedures to policy changes in health care provision—is also changing. The journal features reports of innovation and research in medical education, case studies, survey articles, practical guidelines, reviews of current literature, and book reviews. All articles are peer reviewed.