A Comparative Analysis of Large Language Model Accuracy for Image-Based Hair Disease Identification in Diverse Skin Tones

Impact Factor 2.5 · Zone 4 (Medicine) · JCR Q1 (Medicine, General & Internal)
Willow D. Pastard MS, Willow Pastard MS, Zane Sejdiu BS, Alexis Arza BS, James Cross MBA, Razmig Garabet BS, Anna Chacon MD, Ellen N. Pritchett MD
Journal of the National Medical Association, Volume 116, Issue 4, Page 426. Published 2024-08-01. Journal Article.
DOI: 10.1016/j.jnma.2024.07.035
URL: https://www.sciencedirect.com/science/article/pii/S0027968424001160
Citations: 0

Abstract

Purpose

The rapid integration of artificial intelligence (AI) in dermatology shows promise for supporting clinical practice and democratizing access to diagnosis. However, significant limitations and ethical concerns persist. Despite growing research into AI's effectiveness in identifying skin conditions, fewer studies have explored its ability to accurately diagnose hair disorders.

Methods

This study explores the capacity of the large language model (LLM) ChatGPT to correctly identify alopecia areata, androgenetic alopecia, traction alopecia, and central centrifugal cicatricial alopecia across a range of skin tones. Images of hair disorders were sorted by Monk Skin Tone Scale value into lighter (Monk Scale 1-5) and darker (Monk Scale 6-10) categories. Images were sourced from publicly accessible databases.

Results

Our analysis revealed significant differences in diagnosis rates. ChatGPT was more likely to correctly identify disease in lighter skin, notably for alopecia areata (p<.001) and androgenetic alopecia (p=.003). This trend was also seen in overall diagnosis rates (p<.001). Interestingly, the program misidentified 24.48% of all hair conditions in dark skin as traction alopecia. Additionally, although this study initially sought to explore ChatGPT's ability to diagnose common nail disorders across skin tones, this could not be completed because too few images depicting nail disorders in darker skin were available.
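
The abstract does not state which statistical test produced the reported p-values. As a rough illustration only, the sketch below groups images by Monk Skin Tone value as described in the Methods (1-5 lighter, 6-10 darker) and compares correct-identification rates between the two groups with a two-sided Fisher's exact test. The choice of test, the function names `monk_group` and `fisher_exact_two_sided`, and the sample records are all assumptions made for this example; none of them are study data or the authors' method.

```python
from math import comb


def monk_group(tone: int) -> str:
    """Map a Monk Skin Tone value (1-10) to the two categories in the Methods."""
    if not 1 <= tone <= 10:
        raise ValueError("Monk Skin Tone values range from 1 to 10")
    return "lighter" if tone <= 5 else "darker"


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Assumed test for illustration; the abstract does not name the test used.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Hypothetical (Monk tone, correctly identified?) records, NOT study data
records = [(2, True), (3, True), (4, True), (5, False),
           (7, False), (8, True), (9, False), (10, False)]

counts = {"lighter": [0, 0], "darker": [0, 0]}  # [correct, incorrect]
for tone, correct in records:
    counts[monk_group(tone)][0 if correct else 1] += 1

p_value = fisher_exact_two_sided(*counts["lighter"], *counts["darker"])
print(counts, round(p_value, 4))
```

With realistic sample sizes, the same 2x2 layout (skin tone group by correct or incorrect identification) could be built per condition as well as overall, which mirrors how the results are reported here.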

Conclusion

These findings highlight some of the limitations of LLMs in accurately diagnosing diseases of the hair and nails. They also point to potential implications for the performance of artificial intelligence trained on dermatologic databases with limited representation.

Source journal
CiteScore: 4.80
Self-citation rate: 3.00%
Articles published: 139
Review time: 98 days
Journal description: Journal of the National Medical Association, the official journal of the National Medical Association, is a peer-reviewed publication whose purpose is to address medical care disparities of persons of African descent. The Journal of the National Medical Association is focused on specialized clinical research activities related to the health problems of African Americans and other minority groups. Special emphasis is placed on the application of medical science to improve the healthcare of underserved populations both in the United States and abroad. The Journal has the following objectives: (1) to expand the base of original peer-reviewed literature and the quality of that research on the topic of minority health; (2) to provide greater dissemination of this research; (3) to offer appropriate and timely recognition of the significant contributions of physicians who serve these populations; and (4) to promote engagement by member and non-member physicians in the overall goals and objectives of the National Medical Association.