Accuracy of machine learning models for pre-diagnosis and diagnosis of pancreatic ductal adenocarcinoma in contrast-CT images: a systematic review and meta-analysis.
IF 2.3 | Zone 3 (Medicine) | JCR Q2: Radiology, Nuclear Medicine & Medical Imaging
Geraldo Lucas Lopes Costa, Guido Tasca Petroski, Luis Guilherme Machado, Bruno Eulalio Santos, Fernanda de Oliveira Ramos, Leo Max Feuerschuette Neto, Graziela De Luca Canto
Abdominal Radiology, published 2024-12-25. DOI: 10.1007/s00261-024-04771-1 (https://doi.org/10.1007/s00261-024-04771-1)
Citations: 0
Abstract
Purpose: To evaluate the diagnostic ability and methodological quality of machine learning (ML) models in detecting pancreatic ductal adenocarcinoma (PDAC) on contrast CT images.
Method: Included studies assessed adults with histopathologically confirmed PDAC, used ML algorithms to interpret the test metrics, and reported sensitivity and specificity. Studies that did not meet the inclusion criteria, segmentation-focused studies, multiple-classifier studies, and non-diagnostic studies were excluded. PubMed, the Cochrane Central Register of Controlled Trials, and Embase were searched without restrictions. Risk of bias was assessed using QUADAS-2; methodological quality was evaluated using the Radiomics Quality Score (RQS) and the Checklist for AI in Medical Imaging (CLAIM). Bivariate random-effects models were used for the meta-analysis of sensitivity and specificity, and I² values and subgroup analyses were used to assess heterogeneity.
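The abstract names I² as its heterogeneity measure but does not show how it is obtained. A minimal sketch of the usual calculation (Higgins' I² from Cochran's Q, here applied to logit-transformed sensitivities) is given below. The per-study counts, the 0.5 continuity correction, and the function names are assumptions made for illustration only; they are not data or code from the review, and this univariate calculation is not the bivariate model the authors fitted.

```python
import math

def logit_and_variance(events, total):
    """Logit of a proportion and its approximate variance (0.5 continuity correction)."""
    a = events + 0.5
    b = total - events + 0.5
    return math.log(a / b), 1.0 / a + 1.0 / b

def i_squared(estimates, variances):
    """Higgins' I^2 in percent, from Cochran's Q with inverse-variance weights."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    if q <= 0:
        return 0.0  # no observed dispersion, so no heterogeneity to report
    return max(0.0, (q - df) / q) * 100.0

# Hypothetical (true positives, diseased patients) per study -- NOT from the review.
hypothetical_studies = [(90, 100), (160, 200), (45, 50), (70, 100)]
pairs = [logit_and_variance(tp, n) for tp, n in hypothetical_studies]
estimates = [p[0] for p in pairs]
variances = [p[1] for p in pairs]
i2 = i_squared(estimates, variances)
print(f"I^2 = {i2:.1f}%")
```

I² expresses the share of total variation across studies that exceeds what sampling error alone would produce, which is why it is truncated at zero when Q falls below its degrees of freedom.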
Results: Nine studies with 12,788 participants were included, of whom 3,997 were included in the meta-analysis. AI models based on CT scans showed an accuracy of 88.7% (95% CI, 87.7%-89.7%), sensitivity of 87.9% (95% CI, 82.9%-91.6%), and specificity of 92.2% (95% CI, 86.8%-95.5%). The six radiomics studies had an average RQS of 17.83 points, and the nine ML methods had an average CLAIM score of 30.55 points.
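The abstract does not say how the interval around the accuracy estimate was computed. As a rough plausibility check, the sketch below assumes a simple Wald (normal-approximation) binomial interval over the 3,997 meta-analyzed participants; that assumption, and the function name, are illustrative and not taken from the paper. The sensitivity and specificity intervals come from the bivariate random-effects model and would not be reproduced this way.

```python
import math

def wald_interval(proportion, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a binomial proportion."""
    se = math.sqrt(proportion * (1.0 - proportion) / n)
    return proportion - z * se, proportion + z * se

# Reported accuracy and meta-analysis sample size from the abstract.
low, high = wald_interval(0.887, 3997)
print(f"{low:.1%} to {high:.1%}")  # prints "87.7% to 89.7%"
```

Under this assumption the result agrees with the reported 87.7%-89.7%, which suggests, without confirming, that the accuracy interval reflects pooled counts rather than between-study variation.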
Conclusions: Our study is the first to quantitatively synthesize multiple independent studies, offering insights for clinical application. Despite favorable sensitivity and specificity, the included studies were of low quality, which limits definitive conclusions. Further research is necessary to validate these models before widespread adoption.
About the journal:
Abdominal Radiology seeks to meet the professional needs of the abdominal radiologist by publishing clinically pertinent original, review, and practice-related articles on the gastrointestinal and genitourinary tracts and abdominal interventional and radiologic procedures. Case reports are generally not accepted unless they are the first report of a new disease or condition, or part of a special solicited section.