Krishna D Unadkat, Isra Abdulwadood, Annika N Hiredesai, Carina P Howlett, Laura E Geldmaker, Shelley S Noland
Journal of Hand and Microsurgery, 17(3):100217. Published 2025-01-23 (eCollection 2025-05-01). DOI: 10.1016/j.jham.2025.100217
ChatGPT 4.0's efficacy in the self-diagnosis of non-traumatic hand conditions.
Background: With advancements in artificial intelligence, patients increasingly turn to generative AI models like ChatGPT for medical advice. This study explores the utility of ChatGPT 4.0 (GPT-4.0), the most recent version of ChatGPT, as an interim diagnostician for common hand conditions. Secondarily, the study evaluates the terminology GPT-4.0 associates with each condition by assessing its ability to generate condition-specific questions from a patient's perspective.
Methods: Five common hand conditions were identified: trigger finger (TF), Dupuytren's contracture (DC), carpal tunnel syndrome (CTS), de Quervain's tenosynovitis (DQT), and thumb carpometacarpal osteoarthritis (CMC). GPT-4.0 was queried with author-generated questions. The frequencies of correct diagnoses, differential diagnoses, and recommendations were recorded. Chi-squared and pairwise Fisher's exact tests were used to compare response accuracy between conditions. GPT-4.0 was also prompted to produce its own condition-specific questions, and common terms in its responses were recorded.
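The statistical comparison described above can be sketched as follows. This is a minimal illustration, not the study's analysis code: the per-condition correct/incorrect counts below are hypothetical (only CMC's 15 of 25 correct is stated exactly in the abstract; the other conditions are filled in to reflect the reported >95% accuracy), using `scipy.stats` for the chi-squared and pairwise Fisher's exact tests.

```python
from itertools import combinations
from scipy.stats import chi2_contingency, fisher_exact

# Hypothetical (correct, incorrect) counts per condition, assuming
# 25 queries each; only CMC's 15/25 is reported in the abstract.
counts = {
    "CTS": (25, 0),
    "TF":  (25, 0),
    "DQT": (24, 1),
    "DC":  (24, 1),
    "CMC": (15, 10),
}

# Overall chi-squared test: does accuracy differ across conditions?
table = [list(v) for v in counts.values()]
chi2, p, dof, _ = chi2_contingency(table)
print(f"chi-squared = {chi2:.2f}, df = {dof}, p = {p:.4g}")

# Pairwise Fisher's exact tests on 2x2 tables of the same counts.
for a, b in combinations(counts, 2):
    _, p_pair = fisher_exact([counts[a], counts[b]])
    print(f"{a} vs {b}: Fisher's exact p = {p_pair:.4g}")
```

With counts like these, the overall test is highly significant and the pairwise tests isolate CMC as the outlier, mirroring the pattern the study reports. In practice the pairwise p-values would also need a multiple-comparison correction (e.g. Bonferroni across the 10 pairs).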
Results: GPT-4.0's diagnostic accuracy differed significantly between conditions (p < 0.005). While GPT-4.0 diagnosed CTS, TF, DQT, and DC with >95% accuracy, only 60% (n = 15) of CMC queries were correctly diagnosed. There were also significant differences in the provision of differential diagnoses (p < 0.005), diagnostic tests (p < 0.005), and risk factors (p < 0.05). GPT-4.0 recommended visiting a healthcare provider for 97% (n = 121) of the questions. Analysis of the ChatGPT-generated questions showed that four of the ten most-used terms were shared between DQT and CMC.
Conclusions: The results suggest that GPT-4.0 has potential preliminary diagnostic utility. Future studies should further investigate factors that improve or worsen AI's diagnostic power and consider the implications of patient utilization.