Deval Mehta, Clare Primiero, Brigid Betz-Stablein, Toan D Nguyen, Yaniv Gal, Adrian Bowling, Martin Haskett, Maithili Sashindranath, Paul Bonnington, Victoria Mar, H Peter Soyer, Zongyuan Ge
Multi-task AI models in dermatology: Overcoming critical clinical translation challenges for enhanced skin lesion diagnosis
Journal of the European Academy of Dermatology and Venereology (impact factor 8.4; JCR Q1, Dermatology). Published 2025-02-07. DOI: 10.1111/jdv.20551 (https://doi.org/10.1111/jdv.20551)
Citation count: 0
Abstract
Background: AI models for diagnosing skin lesions through image analysis have surged in number, yet their clinical implementation faces challenges. Common limitations include an over-reliance on dermoscopy, a lack of real-world applicability when only binary output (e.g. benign/malignant) is offered, and low accuracy when faced with rare skin conditions.
Objective: To address these common constraints associated with limited diagnostic output, and applicability to real-world settings.
Methods: We developed an All-In-One Hierarchical-Out of Distribution-Clinical Triage (HOT) AI model for skin lesion analysis. Trained on a large dataset of ~208,000 lesion images, our HOT AI model generates three outputs: a hierarchical three-level prediction, an alert for out-of-distribution (OOD) images and a recommendation for dermoscopy to improve diagnostic prediction.
Results: Our hierarchical prediction output provides a binary Level 1 prediction (benign/malignant), a Level 2 prediction from eight possible categories (e.g. melanocytic and keratinocytic) and a more definitive Level 3 prediction from 44 lesion categories. The model produced high sensitivity for Level 1 prediction (88.14%, CI: 87.42-88.51), but significantly lower sensitivity for Level 3 prediction (63.90%, CI: 62.27-65.61). By relying on all three prediction levels for consensus, Level 1 false-positives were reduced by 20-25% and false-negatives by 11-13% of cases. OOD detection was benchmarked against previous landmark models and outperformed them. Lastly, 44% of images were recommended for dermoscopy, and with the additional image input, Level 3 sensitivity increased from 48.13% (CI: 45.08-49.57) to 52.54% (CI: 50.25-55.04).
Conclusion: Our HOT-AI model attempts to address common challenges in existing models by combining three tasks in one model to increase accuracy and clinical utility. By providing a more nuanced prediction and an alert for OOD images, the model output offers greater explainability of the AI decision process. Prospective clinical testing is required to measure how this additional output affects user trust and how the model performs in a real-world setting.
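The abstract does not say how consensus across the three prediction levels is computed. As a minimal sketch of one plausible reading, the Python below treats a prediction as a consensus only when its Level 3 label maps back to the same Level 2 and Level 1 labels the model reported, and flags it for review otherwise. The `TAXONOMY` entries, the `HierarchicalPrediction` class, the function names and the `"review"` fallback are all hypothetical illustrations, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical fragment of a three-level taxonomy: Level 3 -> (Level 2, Level 1).
# The paper reports 44 Level 3 and eight Level 2 categories; these four
# entries are illustrative placeholders only.
TAXONOMY: Dict[str, Tuple[str, str]] = {
    "melanoma": ("melanocytic", "malignant"),
    "naevus": ("melanocytic", "benign"),
    "basal cell carcinoma": ("keratinocytic", "malignant"),
    "seborrhoeic keratosis": ("keratinocytic", "benign"),
}


@dataclass(frozen=True)
class HierarchicalPrediction:
    """One model output, with a label at each level of the hierarchy."""
    level1: str  # benign / malignant
    level2: str  # broad category, e.g. melanocytic
    level3: str  # specific lesion type


def is_consistent(pred: HierarchicalPrediction) -> bool:
    """True when the Level 3 label implies the reported Level 2 and Level 1 labels."""
    expected = TAXONOMY.get(pred.level3)
    return expected is not None and expected == (pred.level2, pred.level1)


def consensus_level1(pred: HierarchicalPrediction) -> str:
    """Return the Level 1 label if all three levels agree, else 'review' (assumed fallback)."""
    return pred.level1 if is_consistent(pred) else "review"


if __name__ == "__main__":
    agree = HierarchicalPrediction("malignant", "melanocytic", "melanoma")
    clash = HierarchicalPrediction("malignant", "melanocytic", "naevus")
    print(consensus_level1(agree))
    print(consensus_level1(clash))
```

Under this reading, a Level 1 "malignant" call contradicted by a benign Level 3 label is withheld rather than reported. That is one way a consensus rule could reduce Level 1 false positives.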
About the journal:
The Journal of the European Academy of Dermatology and Venereology (JEADV) is a publication that focuses on dermatology and venereology. It covers various topics within these fields, including both clinical and basic science subjects. The journal publishes articles in different formats, such as editorials, review articles, practice articles, original papers, short reports, letters to the editor, features, and announcements from the European Academy of Dermatology and Venereology (EADV).
The journal covers a wide range of keywords, including allergy, cancer, clinical medicine, cytokines, dermatology, drug reactions, hair disease, laser therapy, nail disease, oncology, skin cancer, skin disease, therapeutics, tumors, virus infections, and venereology.
The JEADV is indexed and abstracted by various databases and resources, including Abstracts on Hygiene & Communicable Diseases, Academic Search, AgBiotech News & Information, Botanical Pesticides, CAB Abstracts®, Embase, Global Health, InfoTrac, Ingenta Select, MEDLINE/PubMed, Science Citation Index Expanded, and others.