A radiomics-based machine learning model and SHAP for predicting spread through air spaces and its prognostic implications in stage I lung adenocarcinoma: a multicenter cohort study.

Impact factor 3.5 · Zone 2 (Medicine) · JCR Q2 (Oncology)
Yuhang Wang, Xufeng Liu, Xiaojiang Zhao, Zixiao Wang, Xin Li, Daqiang Sun
Cancer Imaging 25(1):115. Published 2025-09-29. Journal article.
DOI: https://doi.org/10.1186/s40644-025-00935-4
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12482768/pdf/

Abstract

Background: Despite early detection via low-dose computed tomography and complete surgical resection for early-stage lung adenocarcinoma, postoperative recurrence remains high, particularly in patients with tumor spread through air spaces. A reliable preoperative prediction model is urgently needed to adjust the treatment modality.

Methods: In this multicenter retrospective study, 609 patients with pathological stage I lung adenocarcinoma from three independent centers were enrolled. Regions of interest for the primary tumor and peritumoral areas (extended by three, six, and twelve voxel units) were manually delineated on preoperative CT images. Quantitative imaging features were extracted and filtered by correlation analysis and random forest ranking to yield 40 candidate features. Fifteen machine learning methods were evaluated, and a ten-fold cross-validated elastic net regression model was selected to construct the radiomics-based prediction model. A clinical model based on five key clinical variables and a combined model integrating imaging and clinical features were also developed.
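
The correlation-filtering step mentioned in the Methods can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation: the 0.9 cutoff, the function names, and the toy feature values are all assumptions, since the abstract does not report them. The sketch keeps a feature only if its absolute Pearson correlation with every already-kept feature is below the cutoff.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # treat a constant feature as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def filter_correlated(features, cutoff=0.9):
    """Greedily keep features not highly correlated with any kept feature.

    `features` maps a feature name to its per-patient values.
    `cutoff` is a hypothetical threshold; the abstract does not state one.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < cutoff for k in kept):
            kept.append(name)
    return kept


# Toy data invented for illustration: three features over five patients.
toy = {
    "tumor_feature_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "tumor_feature_b": [2.1, 4.0, 6.2, 8.1, 9.9],  # nearly 2 * feature_a
    "peritumoral_feature_c": [5.0, 1.0, 4.0, 2.0, 3.0],
}
selected = filter_correlated(toy)
print(selected)
```

In the pipeline the abstract describes, the surviving features would next be ranked by random forest importance and passed to the cross-validated elastic net; those steps are not shown here.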

Results: The radiomics model achieved accuracies of 0.801, 0.866, and 0.831 in the training set and two external test sets, with AUCs of 0.791, 0.829, and 0.807, respectively. In one external test set, the clinical model had an AUC of 0.689, significantly lower than that of the radiomics model (0.807, p < 0.05). The combined model achieved the highest performance, with AUCs of 0.834 in the training set and 0.894 in an external test set (p < 0.01 and p < 0.001, respectively). Interpretability analysis revealed that wavelet-transformed features dominated the model, with the highest contribution from a feature reflecting small high-intensity clusters within the tumor and the second highest from a feature representing low-intensity clusters in the six-voxel peritumoral region. Kaplan-Meier analysis demonstrated that patients with either pathologically confirmed or model-predicted spread had significantly shorter progression-free survival (p < 0.001).
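
For intuition about the interpretability analysis: because an elastic net model is linear in its inputs, SHAP attributions take a simple closed form when features are treated as independent. Each feature's contribution is its coefficient times the feature's deviation from the background mean. The sketch below illustrates that identity on the linear predictor. The coefficients, feature names, and values are invented for illustration, and the abstract does not say which SHAP explainer the authors used.

```python
def linear_shap(coefs, x, background_mean):
    """SHAP values for a linear model f(x) = intercept + sum(w_i * x_i).

    Assuming independent features, phi_i = w_i * (x_i - E[x_i]).
    """
    return {
        name: coefs[name] * (x[name] - background_mean[name])
        for name in coefs
    }


def predict(coefs, intercept, x):
    """Linear predictor (e.g. log-odds for a logistic elastic net)."""
    return intercept + sum(coefs[name] * x[name] for name in coefs)


# Hypothetical coefficients and feature values, not taken from the paper.
coefs = {
    "wavelet_tumor_feature": 0.8,
    "wavelet_peritumoral_6vox_feature": -0.5,
}
intercept = -0.2
background_mean = {
    "wavelet_tumor_feature": 1.0,
    "wavelet_peritumoral_6vox_feature": 2.0,
}
patient = {
    "wavelet_tumor_feature": 2.5,
    "wavelet_peritumoral_6vox_feature": 1.0,
}

phi = linear_shap(coefs, patient, background_mean)
base_value = predict(coefs, intercept, background_mean)
prediction = predict(coefs, intercept, patient)

# Local accuracy: the base value plus all attributions recovers the prediction.
reconstructed = base_value + sum(phi.values())
print(phi, base_value, prediction, reconstructed)
```

Ranking features by the mean absolute value of such attributions across patients is one common way to arrive at statements like "the highest contribution came from feature X".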

Conclusion: Our novel machine learning model, integrating imaging features from both tumor and peritumoral regions, preoperatively predicts tumor spread through air spaces in stage I lung adenocarcinoma. It outperforms traditional clinical models, highlighting the potential of quantitative imaging analysis in personalizing treatment. Future prospective studies and further optimization are warranted.

Source journal: Cancer Imaging
Categories: Oncology; Radiology, Nuclear Medicine & Medical Imaging
CiteScore: 7.00
Self-citation rate: 0.00%
Articles published: 66
Review time: >12 weeks
Journal description: Cancer Imaging is an open access, peer-reviewed journal publishing original articles, reviews and editorials written by expert international radiologists working in oncology. The journal encompasses CT, MR, PET, ultrasound, radionuclide and multimodal imaging in all kinds of malignant tumours, plus new developments, techniques and innovations. Topics of interest include: Breast Imaging; Chest; Complications of treatment; Ear, Nose & Throat; Gastrointestinal; Hepatobiliary & Pancreatic; Imaging biomarkers; Interventional; Lymphoma; Measurement of tumour response; Molecular functional imaging; Musculoskeletal; Neuro oncology; Nuclear Medicine; Paediatric.