Usefulness of the large language model ChatGPT (GPT‐4) as a diagnostic tool and information source in dermatology

Jacob P. S. Nielsen, Christian Grønhøj, L. Skov, M. Gyldenløve
JEADV Clinical Practice. Published 2024-06-03. DOI: 10.1002/jvc2.459 (https://doi.org/10.1002/jvc2.459)

Abstract

The field of artificial intelligence is rapidly evolving. As an easily accessible platform with vast user engagement, the Chat Generative Pre‐Trained Transformer (ChatGPT) holds great promise in medicine, with the latest version, GPT‐4, capable of analyzing clinical images.

This study aimed to evaluate ChatGPT as a diagnostic tool and information source in clinical dermatology.

A total of 15 clinical images depicting various common and rare skin conditions were selected from the Danish web atlas, Danderm. The images were uploaded to ChatGPT version GPT‐4, which was prompted with ‘Please provide a description, a potential diagnosis, and treatment options for the following dermatological condition’. The generated responses were assessed by senior registrars in dermatology and consultant dermatologists in terms of accuracy, relevance, and depth (scale 1–5). The image quality was also rated (scale 0–10). Demographic and professional information about the respondents was registered.

A total of 23 physicians participated in the study. The majority of the respondents were consultant dermatologists (83%), and 48% had more than 10 years of training. The overall image quality had a median rating of 10 out of 10 [interquartile range (IQR): 9–10]. The overall median rating of the ChatGPT‐generated responses was 2 (IQR: 1–4), while the overall median ratings for relevance, accuracy, and depth were 2 (IQR: 1–4), 3 (IQR: 2–4), and 2 (IQR: 1–3), respectively.

Despite the advancements in ChatGPT, including newly added image processing capabilities, the chatbot demonstrated significant limitations in providing reliable and clinically useful responses to illustrative images of various dermatological conditions.
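
The ratings above are reported as a median with an interquartile range, which is a common way to summarize ordinal scores such as a 1–5 scale. The following is a minimal Python sketch of that kind of summary. The function name `summarize` and the example ratings are hypothetical and are not data from the study, and the quantile convention used here (the "inclusive" method) is an assumption, since the abstract does not state how the quartiles were computed.

```python
from statistics import median, quantiles


def summarize(ratings):
    """Return (median, first quartile, third quartile) for a list of ratings.

    Assumption: quartiles use the 'inclusive' interpolation method; other
    conventions can give slightly different quartile values.
    """
    q1, _, q3 = quantiles(ratings, n=4, method="inclusive")
    return median(ratings), q1, q3


# Hypothetical ratings on the 1-5 scale; NOT data from the study.
example_ratings = [1, 1, 2, 2, 2, 3, 4, 4, 5]

med, q1, q3 = summarize(example_ratings)
print(f"median {med} (IQR: {q1}-{q3})")
```

Medians and quartiles are generally preferred over means for rating scales like these because the scores are ordinal and the distance between adjacent scale points is not necessarily equal.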