Machine Learning Model for Predicting Pathological Invasiveness of Pulmonary Ground-Glass Nodules Based on AI-Extracted Radiomic Features.

IF 2.3, Zone 3 (Medicine), JCR Q3 (Oncology)

Thoracic Cancer. Pub Date: 2025-08-01. DOI: 10.1111/1759-7714.70128

Guozhen Yang, Yuanheng Huang, Huiguo Chen, Weibin Wu, Yonghui Wu, Kai Zhang, Xiaojun Li, Jiannan Xu, Jian Zhang

Thoracic Cancer, volume 16, issue 15, e70128. Publication type: Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12313823/pdf/

Citations: 0

Abstract

Background: With the widespread adoption of low-dose CT screening, the detection of pulmonary ground-glass nodules (GGNs) has risen markedly, presenting diagnostic challenges in distinguishing preinvasive lesions from invasive adenocarcinomas (IAC). This study aimed to develop a machine learning (ML)-based model using artificial intelligence (AI)-extracted CT radiomic features to predict the invasiveness of GGNs.

Methods: A retrospective cohort of 285 patients (148 with preinvasive lesions, 137 with IAC) from the Lingnan Campus was divided into training and internal validation sets (8:2). An independent cohort of 210 patients (118 with preinvasive lesions, 92 with IAC) from the Tianhe Campus served as external validation. Nineteen radiomic features were extracted and filtered using Boruta and LASSO algorithms. Seven ML classifiers were evaluated using AUC-ROC, decision curve analysis (DCA), and SHAP interpretability.
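The 8:2 split and the AUC-ROC evaluation described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract does not say whether the split was stratified, so stratification, the seed, and the helper names `stratified_split` and `auc_roc` are assumptions. Only the cohort sizes (148 preinvasive, 137 IAC) come from the abstract.

```python
import random


def stratified_split(labels, train_frac=0.8, seed=0):
    """Return (train_idx, val_idx), splitting each class separately.

    Stratification is an assumption; the abstract only states an 8:2 ratio.
    """
    rng = random.Random(seed)
    train_idx, val_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(train_frac * len(idx))
        train_idx.extend(idx[:cut])
        val_idx.extend(idx[cut:])
    return sorted(train_idx), sorted(val_idx)


def auc_roc(labels, scores):
    """AUC as the probability that a random positive case outscores
    a random negative case, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Cohort sizes from the abstract: 148 preinvasive (label 0), 137 IAC (label 1).
labels = [0] * 148 + [1] * 137
train_idx, val_idx = stratified_split(labels)
print(len(train_idx), len(val_idx))

# Tiny made-up scores, only to exercise the AUC function.
print(auc_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Under these assumptions the split yields 228 training and 57 internal-validation cases. The pairwise AUC definition is quadratic in cohort size, which is acceptable for a few hundred patients.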

Results: Median CT value, skewness, 3D long-axis diameter, and transverse diameter were ultimately selected for model construction. Among all classifiers, the Gradient Boosting Machine (GBM) model achieved the best performance (AUC = 0.965 in training, 0.908 in internal validation, and 0.965 in external validation). In the external validation cohort, it reached an accuracy of 88.1%, a specificity of 80.7%, and an F1 score of 0.87. On decision curve analysis, the GBM model showed superior net clinical benefit. SHAP analysis identified median CT value and skewness as the most influential predictors.
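For readers less familiar with the reported metrics, the sketch below shows how accuracy, specificity, F1 score, and the net benefit used in decision curve analysis follow from a confusion matrix. The counts are hypothetical and are not the study's confusion matrix, which the abstract does not report. The function names are likewise illustrative.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts (IAC taken as positive)."""
    n = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / n,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


def net_benefit(tp, fp, n, threshold):
    """Net benefit at a given threshold probability, as used in decision
    curve analysis: TP/n - FP/n * pt / (1 - pt)."""
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Hypothetical counts for illustration only (not taken from the paper).
tp, fp, tn, fn = 40, 10, 45, 5
metrics = classification_metrics(tp, fp, tn, fn)
nb = net_benefit(tp, fp, tp + fp + tn + fn, threshold=0.2)
print(metrics, nb)
```

A decision curve repeats the net-benefit calculation over a range of threshold probabilities and compares the model against the treat-all and treat-none strategies.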

Conclusion: This study presents a simplified ML model using AI-extracted radiomic features, which has strong predictive performance and biological interpretability for preoperative risk stratification of GGNs. By leveraging median CT value, skewness, 3D long-axis diameter, and transverse diameter, the model enables accurate and noninvasive differentiation between IAC and indolent lesions, supporting precise surgical planning.
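The SHAP attributions mentioned in the Results are based on Shapley values, and with only four features these can be computed exactly by enumerating feature subsets. The sketch below is an illustration under stated assumptions. The linear `toy_risk` score, its weights, the example nodule, and the single baseline are all made up and stand in for the study's GBM, which is not available here. SHAP implementations typically average over a background dataset rather than a single baseline.

```python
from itertools import combinations
from math import factorial

# Arbitrary illustrative weights; NOT coefficients from the study.
WEIGHTS = {
    "median_ct": 0.01,
    "skewness": -1.0,
    "long_axis_3d": 0.1,
    "transverse_diameter": 0.05,
}


def toy_risk(x):
    """Hypothetical linear risk score standing in for the trained model."""
    return sum(WEIGHTS[k] * x[k] for k in WEIGHTS)


def shapley_values(model, x, baseline):
    """Exact Shapley values: features outside a coalition take baseline values."""
    names = list(x)
    n = len(names)

    def value(subset):
        mixed = {k: (x[k] if k in subset else baseline[k]) for k in names}
        return model(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for r in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(r) * factorial(n - r - 1) / factorial(n)
            for subset in combinations(others, r):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


# Made-up nodule and reference values, for illustration only.
x = {"median_ct": -450.0, "skewness": 0.2,
     "long_axis_3d": 14.0, "transverse_diameter": 11.0}
baseline = {"median_ct": -600.0, "skewness": 0.6,
            "long_axis_3d": 9.0, "transverse_diameter": 8.0}
phi = shapley_values(toy_risk, x, baseline)
print(phi)
```

For a linear score each Shapley value reduces to the weight times the feature's deviation from the baseline, and the values sum to the difference between the prediction and the baseline prediction, which makes the sketch easy to check by hand.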

Source journal

Thoracic Cancer ONCOLOGY-RESPIRATORY SYSTEM

CiteScore: 5.20

Self-citation rate: 3.40%

Articles published: 439

Review time: 2 months

Journal description: Thoracic Cancer aims to facilitate international collaboration and exchange of comprehensive and cutting-edge information on basic, translational, and applied clinical research in lung cancer, esophageal cancer, mediastinal cancer, breast cancer and other thoracic malignancies. Prevention, treatment and research relevant to Asia-Pacific is a focus area, but submissions from all regions are welcomed. The editors encourage contributions relevant to prevention, general thoracic surgery, medical oncology, radiology, radiation medicine, pathology, basic cancer research, as well as epidemiological and translational studies in thoracic cancer. Thoracic Cancer is the official publication of the Chinese Society of Lung Cancer, International Chinese Society of Thoracic Surgery and is endorsed by the Korean Association for the Study of Lung Cancer and the Hong Kong Cancer Therapy Society. The Journal publishes a range of article types including: Editorials, Invited Reviews, Mini Reviews, Original Articles, Clinical Guidelines, Technological Notes, Imaging in thoracic cancer, Meeting Reports, Case Reports, Letters to the Editor, Commentaries, and Brief Reports.