Artificial Intelligence in Hand and Upper Extremity Surgery Education: Accuracy and Validity of ChatGPT-4o Versus UpToDate as a Learning Tool for Trainees
Caleb Bercu, Brianna Rosner, Aneeq Chaudhry, Hannah Korah, Isabel Bernal, Aaron Berger
Eplasty. 2025;25:e17. Published 2025-05-14. Journal Article.
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12257968/pdf/
Background: The use of artificial intelligence (AI) in medical education has risen rapidly. Trainees can ask ChatGPT-4o (OpenAI) clinical questions and receive management recommendations. Previous studies have assessed the accuracy of ChatGPT, but none have examined hand and upper extremity surgery. This study aimed to evaluate the accuracy of ChatGPT-4o compared to UpToDate (Wolters Kluwer) and categorize the validity of sources provided by ChatGPT-4o.
Methods: Five hand and upper extremity surgery cases were entered into ChatGPT-4o. An UpToDate article was selected for each case. Two hand surgeons and 5 medical students completed a survey comparing the resources. Resources were rated on a scale from 1 to 3, with 1 indicating incomplete information and not useful; 2 indicating semi-complete information and somewhat useful; and 3 indicating a complete answer and useful for management. ChatGPT-4o references were scored on a validity scale of 0 to 2.
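The rating scale described in the Methods can be sketched in a few lines of Python. This is only an illustration of how ratings on a 1-to-3 scale might be summarized by a median; the function name `summarize` and the example rating lists are hypothetical and are not the study's data or its analysis code.

```python
from statistics import median

# Rating scale as described in the Methods paragraph.
RATING_LABELS = {
    1: "incomplete information, not useful",
    2: "semi-complete information, somewhat useful",
    3: "complete answer, useful for management",
}


def summarize(ratings):
    """Return the median of a list of 1-3 ratings, rejecting out-of-scale values."""
    for r in ratings:
        if r not in RATING_LABELS:
            raise ValueError(f"rating {r!r} is outside the 1-3 scale")
    return median(ratings)


# Hypothetical ratings for illustration only (NOT data from the study).
example_odd = [2, 2, 3, 1, 2]       # odd count: median is the middle value
example_even = [2, 3, 3, 2, 3, 2]   # even count: median averages the two middle values

print(summarize(example_odd))   # 2
print(summarize(example_even))  # 2.5
```

With an even number of ratings, the median is the mean of the two middle values, which is one way a half-point median can arise on an integer scale.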
Results: Hand and upper extremity surgeons rated both ChatGPT-4o and UpToDate as semi-complete and somewhat useful, with median scores of 2.00 and 2.50, respectively; the difference was not statistically significant. Medical students rated ChatGPT-4o as semi-complete and somewhat useful overall and more often rated UpToDate as providing a complete answer and being useful, but this difference was also not statistically significant. Of the 25 references provided by ChatGPT-4o, 28% were accurate, 6% were somewhat accurate, and 66% were inaccurate.
Conclusions: The findings indicate overall comparable perceived usefulness of ChatGPT-4o and UpToDate by hand/upper extremity surgeons and trainees. ChatGPT-4o holds promise as an educational tool; however, accuracy concerns remain.