Deep learning for liver lesion segmentation and classification on staging CT scans of colorectal cancer patients: a multi-site technical validation study
Impact factor 2.1; Zone 3 (Medicine); JCR Q2: RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING
U. Bashir , C. Wang , R. Smillie , A.K. Rayabat Khan , H. Tamer Ahmed , K. Ordidge , N. Power , M. Gerlinger , G. Slabaugh , Q. Zhang
Clinical Radiology, vol. 85, Article 106914. Published 2025-04-01. DOI: 10.1016/j.crad.2025.106914. https://www.sciencedirect.com/science/article/pii/S0009926025001199
Citations: 0
Abstract
AIM
To validate a liver lesion detection and classification model using staging computed tomography (CT) scans of colorectal cancer (CRC) patients.
MATERIALS AND METHODS
A UNet-based deep learning model was trained on 272 public liver tumour CT scans and tested on 220 CRC staging CTs acquired from a single institution (2014–2019). Performance metrics included lesion detection rates by size (<10 mm, 10–20 mm, >20 mm), segmentation accuracy (Dice similarity coefficient, DSC), volume measurement agreement (Bland–Altman limits of agreement, LOAs; intraclass correlation coefficient, ICC), and classification accuracy (malignant vs benign) at patient and lesion levels (detected lesions only).
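As a rough illustration of two of the metrics named above, the following sketch computes a Dice similarity coefficient for a pair of binary masks and Bland–Altman bias with 95% limits of agreement for paired volume measurements. It is not the study's code: the function names `dice` and `bland_altman`, the flat-list mask representation, the treatment of two empty masks as perfect agreement, and the use of bias ± 1.96 SD for the limits are all assumptions made for the example.

```python
from statistics import mean, stdev


def dice(pred, truth):
    """Dice similarity coefficient for two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")
    # Voxels labelled as lesion in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total lesion voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def bland_altman(model_volumes, reference_volumes):
    """Return (bias, lower LOA, upper LOA) for paired measurements.

    Bias is the mean of (model - reference); the limits of agreement are
    assumed here to be bias +/- 1.96 sample standard deviations.
    """
    if len(model_volumes) != len(reference_volumes):
        raise ValueError("measurements must be paired")
    diffs = [m - r for m, r in zip(model_volumes, reference_volumes)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample SD; needs at least two pairs
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


if __name__ == "__main__":
    # Toy values for illustration only, not data from the study.
    print(dice([1, 1, 0, 0], [1, 0, 0, 0]))
    print(bland_altman([1.0, 2.0, 3.0], [1.5, 2.0, 3.5]))
```

A negative bias under this sign convention means the model's volumes are, on average, smaller than the reference volumes.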
RESULTS
The model detected 743 out of 884 lesions (84%), with detection rates of 75%, 91.3%, and 96% for lesions <10 mm, 10–20 mm, and >20 mm, respectively. The median DSC was 0.76 (95% CI: 0.72–0.80) for lesions <10 mm, 0.83 (95% CI: 0.79–0.86) for 10–20 mm, and 0.85 (95% CI: 0.82–0.88) for >20 mm. Bland–Altman analysis showed a mean volume bias of -0.12 cm³ (LOAs: -1.68 to +1.43 cm³), and ICC was 0.81. Lesion-level classification showed 99.5% sensitivity, 65.7% specificity, 53.8% positive predictive value (PPV), 99.7% negative predictive value (NPV), and 75.4% accuracy. Patient-level classification had 100% sensitivity, 27.1% specificity, 59.2% PPV, 100% NPV, and 64.5% accuracy.
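The classification figures above all derive from a 2×2 confusion matrix. The sketch below shows the standard definitions, together with the overall detection rate recomputed from the reported 743 of 884 lesions. The abstract does not report the underlying confusion counts, so the counts in the usage lines are hypothetical and chosen only to show the arithmetic; the function name `classification_metrics` is likewise an assumption.

```python
def _ratio(numerator, denominator):
    """Safe division: returns NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(tp, fp, tn, fn):
    """Standard diagnostic metrics from confusion-matrix counts.

    Positive = malignant, negative = benign (the convention assumed here).
    """
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
    }


# Detection rate from the counts reported in the abstract.
detection_rate = 743 / 884

if __name__ == "__main__":
    print(f"detection rate: {detection_rate:.1%}")
    # Hypothetical counts, NOT taken from the study: zero false negatives
    # give 100% sensitivity and 100% NPV even when specificity is low.
    print(classification_metrics(tp=50, fp=30, tn=20, fn=0))
```

With zero false negatives, sensitivity and NPV are both 100% regardless of how many false positives occur, which matches the pattern of the patient-level results: very high NPV alongside low specificity.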
CONCLUSION
The model demonstrates strong lesion detection and segmentation performance, particularly for sub-centimetre lesions. Although classification accuracy was moderate, the 100% NPV suggests strong potential as a CRC staging screening tool. Future studies will assess its impact on radiologist performance and efficiency.
About the journal:
Clinical Radiology is published by Elsevier on behalf of The Royal College of Radiologists. It is an international journal publishing original research, editorials and review articles on all aspects of diagnostic imaging, including:
• Computed tomography
• Magnetic resonance imaging
• Ultrasonography
• Digital radiology
• Interventional radiology
• Radiography
• Nuclear medicine
Papers on radiological protection, quality assurance, audit in radiology and matters relating to radiological training and education are also included. In addition, each issue contains correspondence, book reviews and notices of forthcoming events.