ChatGPT 4.0's efficacy in the self-diagnosis of non-traumatic hand conditions

IF 0.5 Q4 SURGERY
Journal of Hand and Microsurgery · Pub Date: 2025-01-23 · eCollection Date: 2025-05-01 · DOI: 10.1016/j.jham.2025.100217
Krishna D Unadkat, Isra Abdulwadood, Annika N Hiredesai, Carina P Howlett, Laura E Geldmaker, Shelley S Noland
{"title":"ChatGPT 4.0在非创伤性手部疾病自我诊断中的疗效。","authors":"Krishna D Unadkat, Isra Abdulwadood, Annika N Hiredesai, Carina P Howlett, Laura E Geldmaker, Shelley S Noland","doi":"10.1016/j.jham.2025.100217","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>With advancements in artificial intelligence, patients increasingly turn to generative AI models like ChatGPT for medical advice. This study explores the utility of ChatGPT 4.0 (GPT-4.0), the most recent version of ChatGPT, as an interim diagnostician for common hand conditions. Secondarily, the study evaluates the terminology GPT-4.0 associates with each condition by assessing its ability to generate condition-specific questions from a patient's perspective.</p><p><strong>Methods: </strong>Five common hand conditions were identified: trigger finger (TF), Dupuytren's Contracture (DC), carpal tunnel syndrome (CTS), de Quervain's tenosynovitis (DQT), and thumb carpometacarpal osteoarthritis (CMC). GPT-4.0 was queried with author-generated questions. The frequency of correct diagnoses, differential diagnoses, and recommendations were recorded. Chi-squared and pairwise Fisher's exact tests were used to compare response accuracy between conditions. GPT-4.0 was prompted to produce its own questions. Common terms in responses were recorded.</p><p><strong>Results: </strong>GPT-4.0's diagnostic accuracy significantly differed between conditions (p < 0.005). While GPT-4.0 diagnosed CTS, TF, DQT, and DC with >95 % accuracy, 60 % (n = 15) of CMC queries were correctly diagnosed. Additionally, there were significant differences in providing of differential diagnoses (p < 0.005), diagnostic tests (p < 0.005), and risk factors (p < 0.05). GPT-4.0 recommended visiting a healthcare provider for 97 % (n = 121) of the questions. Analysis of ChatGPT-generated questions showed four of the ten most used terms were shared between DQT and CMC.</p><p><strong>Conclusions: </strong>The results suggest that GPT-4.0 has potential preliminary diagnostic utility. Future studies should further investigate factors that improve or worsen AI's diagnostic power and consider the implications of patient utilization.</p>","PeriodicalId":45368,"journal":{"name":"Journal of Hand and Microsurgery","volume":"17 3","pages":"100217"},"PeriodicalIF":0.5000,"publicationDate":"2025-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11849648/pdf/","citationCount":"0","resultStr":"{\"title\":\"ChatGPT 4.0's efficacy in the self-diagnosis of non-traumatic hand conditions.\",\"authors\":\"Krishna D Unadkat, Isra Abdulwadood, Annika N Hiredesai, Carina P Howlett, Laura E Geldmaker, Shelley S Noland\",\"doi\":\"10.1016/j.jham.2025.100217\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Background: </strong>With advancements in artificial intelligence, patients increasingly turn to generative AI models like ChatGPT for medical advice. This study explores the utility of ChatGPT 4.0 (GPT-4.0), the most recent version of ChatGPT, as an interim diagnostician for common hand conditions. 
Secondarily, the study evaluates the terminology GPT-4.0 associates with each condition by assessing its ability to generate condition-specific questions from a patient's perspective.</p><p><strong>Methods: </strong>Five common hand conditions were identified: trigger finger (TF), Dupuytren's Contracture (DC), carpal tunnel syndrome (CTS), de Quervain's tenosynovitis (DQT), and thumb carpometacarpal osteoarthritis (CMC). GPT-4.0 was queried with author-generated questions. The frequency of correct diagnoses, differential diagnoses, and recommendations were recorded. Chi-squared and pairwise Fisher's exact tests were used to compare response accuracy between conditions. GPT-4.0 was prompted to produce its own questions. Common terms in responses were recorded.</p><p><strong>Results: </strong>GPT-4.0's diagnostic accuracy significantly differed between conditions (p < 0.005). While GPT-4.0 diagnosed CTS, TF, DQT, and DC with >95 % accuracy, 60 % (n = 15) of CMC queries were correctly diagnosed. Additionally, there were significant differences in providing of differential diagnoses (p < 0.005), diagnostic tests (p < 0.005), and risk factors (p < 0.05). GPT-4.0 recommended visiting a healthcare provider for 97 % (n = 121) of the questions. Analysis of ChatGPT-generated questions showed four of the ten most used terms were shared between DQT and CMC.</p><p><strong>Conclusions: </strong>The results suggest that GPT-4.0 has potential preliminary diagnostic utility. Future studies should further investigate factors that improve or worsen AI's diagnostic power and consider the implications of patient utilization.</p>\",\"PeriodicalId\":45368,\"journal\":{\"name\":\"Journal of Hand and Microsurgery\",\"volume\":\"17 3\",\"pages\":\"100217\"},\"PeriodicalIF\":0.5000,\"publicationDate\":\"2025-01-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11849648/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Hand and Microsurgery\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1016/j.jham.2025.100217\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2025/5/1 0:00:00\",\"PubModel\":\"eCollection\",\"JCR\":\"Q4\",\"JCRName\":\"SURGERY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Hand and Microsurgery","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1016/j.jham.2025.100217","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/5/1 0:00:00","PubModel":"eCollection","JCR":"Q4","JCRName":"SURGERY","Score":null,"Total":0}
Citations: 0

Abstract

Background: With advancements in artificial intelligence, patients increasingly turn to generative AI models like ChatGPT for medical advice. This study explores the utility of ChatGPT 4.0 (GPT-4.0), the most recent version of ChatGPT, as an interim diagnostician for common hand conditions. Secondarily, the study evaluates the terminology GPT-4.0 associates with each condition by assessing its ability to generate condition-specific questions from a patient's perspective.

Methods: Five common hand conditions were identified: trigger finger (TF), Dupuytren's contracture (DC), carpal tunnel syndrome (CTS), de Quervain's tenosynovitis (DQT), and thumb carpometacarpal osteoarthritis (CMC). GPT-4.0 was queried with author-generated questions. The frequencies of correct diagnoses, differential diagnoses, and recommendations were recorded. Chi-squared and pairwise Fisher's exact tests were used to compare response accuracy between conditions. GPT-4.0 was also prompted to produce its own questions, and common terms in its responses were recorded.
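
The statistical comparison described here can be illustrated with a short sketch. The snippet below (Python with SciPy) runs an overall chi-squared test and pairwise Fisher's exact tests on per-condition counts of correct versus incorrect diagnoses; the counts are hypothetical placeholders (the abstract reports only percentages), so this shows the shape of the analysis rather than the study's actual computation.

```python
# Minimal sketch of the described analysis: an overall chi-squared test plus
# pairwise Fisher's exact tests on correct/incorrect diagnosis counts.
# The counts below are hypothetical placeholders, not the study's data.
from itertools import combinations

from scipy.stats import chi2_contingency, fisher_exact

# Hypothetical [correct, incorrect] counts per condition (assumes 25 queries
# per condition, which the abstract does not state).
counts = {
    "CTS": [24, 1],
    "TF":  [24, 1],
    "DQT": [24, 1],
    "DC":  [24, 1],
    "CMC": [15, 10],
}

# Overall test: does diagnostic accuracy differ across the five conditions?
chi2, p_overall, dof, _ = chi2_contingency([counts[c] for c in counts])
print(f"chi-squared: chi2={chi2:.2f}, dof={dof}, p={p_overall:.4f}")

# Pairwise 2x2 comparisons between conditions.
for a, b in combinations(counts, 2):
    _, p = fisher_exact([counts[a], counts[b]])
    print(f"Fisher exact {a} vs {b}: p={p:.4f}")
```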

Results: GPT-4.0's diagnostic accuracy differed significantly between conditions (p < 0.005). While GPT-4.0 diagnosed CTS, TF, DQT, and DC with >95% accuracy, only 60% (n = 15) of CMC queries were diagnosed correctly. There were also significant differences in the provision of differential diagnoses (p < 0.005), diagnostic tests (p < 0.005), and risk factors (p < 0.05). GPT-4.0 recommended visiting a healthcare provider for 97% (n = 121) of the questions. Analysis of ChatGPT-generated questions showed that four of the ten most-used terms were shared between DQT and CMC.
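
The shared-terminology observation amounts to comparing the most frequent words in the questions GPT-4.0 generated for each condition. A minimal sketch of that kind of tally is shown below; the example questions and the top_terms helper are invented for illustration and are not the study's prompts or outputs.

```python
# Hedged sketch of a term-overlap comparison between two conditions.
# The question lists are invented examples, not GPT-4.0 output from the study.
from collections import Counter

def top_terms(questions, k=10):
    """Return the k most common lowercase words across a list of questions."""
    words = [w.strip("?,.!").lower() for q in questions for w in q.split()]
    return {term for term, _ in Counter(words).most_common(k)}

dqt_questions = [
    "Why does the thumb side of my wrist hurt when I grip or lift?",
    "Is the pain worse when I twist my wrist or pick up my baby?",
]
cmc_questions = [
    "Why does the base of my thumb ache when I grip or pinch?",
    "Does the pain at the base of my thumb get worse with use?",
]

shared = top_terms(dqt_questions) & top_terms(cmc_questions)
print(f"{len(shared)} shared top terms: {sorted(shared)}")
```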

Conclusions: The results suggest that GPT-4.0 has potential preliminary diagnostic utility. Future studies should further investigate factors that improve or worsen AI's diagnostic power and consider the implications of patient utilization.

Source journal metrics: CiteScore 1.00 · Self-citation rate 25.00% · Articles published: 39