Does AI need anything more than a single image to diagnose melanoma?

IF 8.4 · Zone 2, Medicine · JCR Q1 Dermatology
Aimilios Lallas, John Paoli
Journal of the European Academy of Dermatology and Venereology, vol. 39, no. 8, pp. 1378-1379. Published 2025-07-25. DOI: 10.1111/jdv.20793. Full text: https://onlinelibrary.wiley.com/doi/10.1111/jdv.20793

Abstract

Recent research consistently demonstrates the high accuracy of artificial intelligence (AI)-driven image analysis in diagnosing melanoma. The usual benchmark for comparison has been the performance of human readers, whom AI outperformed even in the initial experimental studies.[1]

A significant drawback of these studies is that they fail to replicate the clinical setting, because they omit parameters that are highly relevant in real-world scenarios.[2] Most studies used single clinical or dermoscopic images of lesions both for training algorithms and for evaluating the performance of algorithms and human raters. While this approach seems logical for an AI algorithm, it contrasts sharply with clinical practice. Clinicians do not evaluate single images; they examine unique individuals and weigh a multitude of factors in a comprehensive evaluation. The clinical assessment encompasses phenotype, phototype, pigmentary traits, total lesion count, a detailed analysis of lesion types with their distinct features and texture, and a review of each lesion's evolution. The failure of previous studies to include these parameters was one of the main limitations on the applicability of their findings to clinical practice.

The study by Kurtansky et al. represents one of the first efforts to integrate contextual information into the training and evaluation of AI algorithms for melanoma diagnosis.[3] It reports the outcomes of the 2020 SIIM-ISIC Melanoma Classification Challenge, in which 3308 teams from 97 countries submitted a total of 101,845 entries. Most importantly, this was the first initiative to use a data set of patient-contextual lesion images to evaluate the influence of intrapatient lesion patterns on melanoma classification. In the reader study, each index image was first assessed alone and then alongside seven additional dermoscopic images of nevi from the same patient.

The study reports two main findings. First, the top-performing AI algorithm for melanoma diagnosis achieved an area under the receiver operating characteristic curve (AUC) of 0.95. This result is consistent with the steady improvement in algorithm performance in recent years, driven by the availability of larger training sets and ongoing advances in deep learning techniques.
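An area under the receiver operating characteristic curve can be read as a ranking probability: the chance that a randomly chosen melanoma receives a higher score than a randomly chosen benign lesion. The following is a minimal sketch of that definition in Python. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data or code from the challenge.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for melanoma, 0 for benign (toy convention for this sketch).
    scores: model outputs, higher meaning more suspicious.
    Tied pairs count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    correctly_ranked = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correctly_ranked += 1.0
            elif p == n:
                correctly_ranked += 0.5
    return correctly_ranked / (len(pos) * len(neg))


# Hypothetical scores for eight lesions (three melanomas, five benign).
toy_labels = [1, 1, 0, 0, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.6, 0.1, 0.4]

print(f"toy AUC = {roc_auc(toy_labels, toy_scores):.3f}")
```

The pairwise form is quadratic in the number of lesions, which is fine for illustration; rank-based implementations are the usual choice for large test sets.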

Second, the study found that including patient-contextual lesion images had no significant effect on diagnostic accuracy, for either the algorithms or the human readers. This result is somewhat unexpected and challenges the assumption that intrapatient lesion comparisons enhance diagnostic performance. Prior evidence suggests that contextual information can improve melanoma detection, as demonstrated by the comparative approach, an intrapatient assessment strategy.[4] Moreover, in real-world clinical settings, clinicians operate with a specificity approaching 100%, owing to their routine exposure to a much larger number of benign lesions. The lack of improvement in diagnostic performance among human readers in the study may indicate that seven images of single lesions are an insufficient approximation of the broader visual context typically available in clinical practice. This stark contrast in diagnostic environment further underscores the challenge of replicating clinical realism in algorithm evaluation frameworks.
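To make the idea of intrapatient comparison concrete, the sketch below scores an index lesion relative to the same patient's other lesions, expressed as a number of standard deviations. This is a hypothetical encoding chosen only for illustration: it is not the method used by the challenge algorithms or by the comparative approach cited above, and the function name `context_deviation`, the floor `min_spread`, and all scores are assumptions.

```python
from statistics import mean, pstdev


def context_deviation(index_score, context_scores, min_spread=1e-6):
    """Distance of the index lesion's score from the patient's other lesions.

    Returns (index_score - mean) / spread, where mean and spread are taken
    over the context lesions only. min_spread is a hypothetical floor that
    avoids division by zero when all context scores are identical.
    """
    if len(context_scores) < 2:
        raise ValueError("need at least two context lesions")
    centre = mean(context_scores)
    spread = max(pstdev(context_scores), min_spread)
    return (index_score - centre) / spread


# Hypothetical per-lesion suspicion scores for seven nevi from one patient.
patient_nevi = [0.10, 0.12, 0.08, 0.11, 0.09, 0.10, 0.10]

# An index lesion that stands out from this patient's nevi...
print(context_deviation(0.40, patient_nevi))
# ...and one that resembles them.
print(context_deviation(0.10, patient_nevi))
```

With only seven context lesions the spread estimate is noisy, which is one way to see why a handful of images may approximate the full clinical context poorly.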

Although the AI algorithms in the study largely ignored contextual information, their diagnostic accuracy remained very high. However, for clinical deployment, algorithms must achieve specificity approaching 100% to avoid a surge in unnecessary excisions of benign lesions.[5] Consequently, more effective integration of contextual information is essential to improve specificity, and future research should prioritize this objective. Furthermore, AI-based image analysis is increasingly applied not only to diagnostic classification but also to prognostic assessment and prediction of treatment response.[6] Incorporating diverse contextual variables, such as patient history, lesion evolution and broader skin patterns, would likely enhance the predictive power of these models across multiple tasks.
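The link between specificity and unnecessary excisions is simple arithmetic: specificity is the share of benign lesions correctly left alone, so the expected number of false positives is the number of benign lesions multiplied by one minus the specificity. A minimal sketch follows; the helper names and the figure of 10,000 benign lesions are hypothetical and chosen only to show the scaling.

```python
def specificity(true_negatives, false_positives):
    """Share of benign lesions that were correctly classified as benign."""
    total_benign = true_negatives + false_positives
    if total_benign == 0:
        raise ValueError("no benign lesions to evaluate")
    return true_negatives / total_benign


def expected_false_positives(n_benign, spec):
    """Benign lesions expected to be flagged at a given specificity."""
    if not 0.0 <= spec <= 1.0:
        raise ValueError("specificity must lie between 0 and 1")
    return round(n_benign * (1.0 - spec))


N_BENIGN = 10_000  # hypothetical number of benign lesions examined

for spec in (0.90, 0.99, 0.999):
    flagged = expected_false_positives(N_BENIGN, spec)
    print(f"specificity {spec:.1%}: about {flagged} benign lesions flagged")
```

Because benign lesions vastly outnumber melanomas in practice, even a small shortfall in specificity translates into many flagged benign lesions.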

None.

Source journal
CiteScore: 10.70
Self-citation rate: 8.70%
Articles published: 874
Review time: 3-6 weeks
About the journal: The Journal of the European Academy of Dermatology and Venereology (JEADV) is a publication that focuses on dermatology and venereology. It covers various topics within these fields, including both clinical and basic science subjects. The journal publishes articles in different formats, such as editorials, review articles, practice articles, original papers, short reports, letters to the editor, features, and announcements from the European Academy of Dermatology and Venereology (EADV). The journal covers a wide range of keywords, including allergy, cancer, clinical medicine, cytokines, dermatology, drug reactions, hair disease, laser therapy, nail disease, oncology, skin cancer, skin disease, therapeutics, tumors, virus infections, and venereology. The JEADV is indexed and abstracted by various databases and resources, including Abstracts on Hygiene & Communicable Diseases, Academic Search, AgBiotech News & Information, Botanical Pesticides, CAB Abstracts®, Embase, Global Health, InfoTrac, Ingenta Select, MEDLINE/PubMed, Science Citation Index Expanded, and others.