Deep Learning Models for Predicting Malignancy Risk in CT-Detected Pulmonary Nodules: A Systematic Review and Meta-analysis
Authors: Wahyu Wulaningsih, Carmela Villamaria, Abdullah Akram, Janella Benemile, Filippo Croce, Johnathan Watkins
Journal: Lung (JCR Q1, Respiratory System; impact factor 4.6)
DOI: 10.1007/s00408-024-00706-1 (https://doi.org/10.1007/s00408-024-00706-1)
Published: 2024-10-01 (Epub 2024-05-23)
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11427562/pdf/
Citations: 0
Abstract
Background: There has been growing interest in using artificial intelligence/deep learning (DL) to help diagnose prevalent diseases earlier. In this study, we sought to survey the landscape of externally validated DL-based computer-aided diagnostic (CADx) models and to assess their diagnostic performance for predicting the risk of malignancy in computed tomography (CT)-detected pulmonary nodules.
Methods: An electronic search was performed in four databases (from inception to 10 August 2023). Studies were eligible if they were peer-reviewed experimental or observational articles comparing the diagnostic performance of externally validated DL-based CADx models with models widely used in clinical practice to predict the risk of malignancy. A bivariate random-effects approach was used for the meta-analysis of the included studies.
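To make the idea of random-effects pooling concrete, the following is a minimal sketch rather than the authors' analysis. It pools per-study sensitivities on the logit scale with the DerSimonian-Laird estimator, which is a simpler univariate stand-in for the bivariate model named above (the bivariate model pools sensitivity and specificity jointly and accounts for their correlation). The function name, the 0.5 continuity correction, and the study counts in the usage example are all assumptions made for illustration.

```python
import math


def pool_random_effects(events, totals):
    """Pool proportions (e.g. per-study sensitivity) on the logit scale.

    Univariate DerSimonian-Laird random-effects pooling; an illustrative
    simplification, NOT the bivariate model used in the review.
    Returns (pooled, ci_low, ci_high, tau2).
    """
    y, v = [], []
    for e, n in zip(events, totals):
        a = e + 0.5          # 0.5 continuity correction (assumption)
        b = n - e + 0.5
        y.append(math.log(a / b))   # logit of the corrected proportion
        v.append(1 / a + 1 / b)     # approximate within-study variance

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1 / vi for vi in v]
    sw = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))

    # Between-study variance tau^2, truncated at zero.
    k = len(y)
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))

    def inv_logit(x):
        return 1 / (1 + math.exp(-x))

    return inv_logit(mu), inv_logit(mu - 1.96 * se), inv_logit(mu + 1.96 * se), tau2


if __name__ == "__main__":
    # Hypothetical counts: true positives and total malignant nodules per study.
    pooled, lo, hi, tau2 = pool_random_effects([45, 80, 30], [50, 100, 40])
    print(f"pooled sensitivity {pooled:.3f} (95% CI {lo:.3f}-{hi:.3f}), tau^2 {tau2:.3f}")
```

Pooling on the logit scale keeps the back-transformed estimate and its interval inside (0, 1), which is one common reason meta-analyses of diagnostic accuracy work on that scale.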
Results: Seventeen studies were included, comprising 8553 participants and 9884 nodules. Pooled analyses showed DL-based CADx models were 11.6% more sensitive than physician judgement alone, and 14.5% more sensitive than clinical risk models alone. They had a similar pooled specificity to physician judgement alone [0.77 (95% CI 0.68-0.84) vs 0.81 (95% CI 0.71-0.88)], and were 7.4% more specific than clinical risk models alone. They had superior pooled areas under the receiver operating characteristic curve (AUC), with relative pooled AUCs of 1.03 (95% CI 1.00-1.07) and 1.10 (95% CI 1.07-1.13) versus physician judgement and clinical risk models alone, respectively.
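As a reading aid for the quantities reported above, the sketch below shows how sensitivity and specificity follow from a 2x2 table and how two pooled values can be compared. The abstract does not state whether its percentage differences are absolute (percentage points) or relative, so the helper returns both. The function names and the 2x2 counts are hypothetical; only the specificity pair 0.77 vs 0.81 is taken from the abstract.

```python
def sensitivity(tp, fn):
    """Share of malignant nodules correctly flagged as malignant."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of benign nodules correctly flagged as benign."""
    return tn / (tn + fp)


def compare(a, b):
    """Compare metric a against reference b.

    Returns (absolute difference in percentage points,
             relative difference in percent of b).
    """
    return (a - b) * 100, (a - b) / b * 100


if __name__ == "__main__":
    # Hypothetical 2x2 counts for a single reader study.
    print(f"sensitivity {sensitivity(tp=90, fn=10):.2f}")
    print(f"specificity {specificity(tn=70, fp=30):.2f}")

    # Pooled specificities reported in the abstract: DL models vs physicians.
    diff_pp, diff_rel = compare(0.77, 0.81)
    print(f"absolute {diff_pp:.1f} points, relative {diff_rel:.1f}%")
```

Because the confidence intervals for 0.77 and 0.81 overlap substantially, a difference of this size is consistent with the abstract's description of the two specificities as similar.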
Conclusion: DL-based models are already used in clinical practice in certain settings for nodule management. Our results show their diagnostic performance potentially justifies wider, more routine deployment alongside experienced physician readers to help inform multidisciplinary team decision-making.
About the journal:
Lung publishes original articles, reviews and editorials on all aspects of the healthy and diseased lungs, of the airways, and of breathing. Epidemiological, clinical, pathophysiological, biochemical, and pharmacological studies fall within the scope of the journal. Case reports, short communications and technical notes can be accepted if they are of particular interest.