CT Radiomics-based machine learning approach for the invasiveness of pulmonary ground-glass nodules prediction

Impact factor: 2.9; JCR Q3 (Radiology, Nuclear Medicine & Medical Imaging)
Rui Chen, Hu Zhang, Xingwen Huang, Haitao Han, Jinbo Jian
European Journal of Radiology Open, volume 15, Article 100680. Published 2025-08-23.
DOI: 10.1016/j.ejro.2025.100680
https://www.sciencedirect.com/science/article/pii/S2352047725000474
Citations: 0

Abstract

Objective

To develop and validate a machine learning model based on CT radiomics to improve the ability to differentiate pathological subtypes of pulmonary ground-glass nodules (GGN).

Methods

A retrospective analysis was conducted on clinical data and radiological images from 392 patients with lung adenocarcinoma treated at Binzhou Medical University Hospital between January 1, 2020 and May 31, 2023. All patients underwent preoperative thin-section chest CT and surgical resection. A total of 400 GGNs were included. Regions of interest (ROIs) were delineated on the slice showing the largest lesion diameter. Based on pathological confirmation, the nodules were divided into two groups: Group 1 (adenocarcinoma in situ, AIS, or minimally invasive adenocarcinoma, MIA; 209 nodules) and Group 2 (invasive adenocarcinoma, IAC; 191 nodules). The dataset was randomly split 7:3 into a training set (280 nodules, 70%) and a validation set (120 nodules, 30%). In the training set, feature dimensionality was reduced using minimum redundancy maximum relevance (mRMR) and the least absolute shrinkage and selection operator (LASSO) to select discriminative radiomics features. Seven machine learning models were then constructed: logistic regression (LR), support vector machine (SVM), random forest (RF), extra trees, XGBoost, GradientBoosting, and AdaBoost. Model performance was evaluated with receiver operating characteristic (ROC) curves, using the area under the curve (AUC), accuracy, specificity, and sensitivity as indicators.
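The 7:3 random split described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code: the nodule IDs, the `random_split` helper, and the seed are hypothetical, and only the group sizes (209 and 191) and the 70/30 proportion come from the abstract.

```python
import random

# Group sizes reported in the abstract: 209 AIS/MIA nodules (label 0)
# and 191 IAC nodules (label 1). The integer IDs are hypothetical.
labels = [0] * 209 + [1] * 191
nodule_ids = list(range(len(labels)))


def random_split(ids, train_fraction=0.7, seed=42):
    """Shuffle a copy of ids with a fixed seed and cut it at train_fraction."""
    rng = random.Random(seed)  # assumed seed, chosen only for reproducibility
    shuffled = ids[:]
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


train_ids, val_ids = random_split(nodule_ids)
print(len(train_ids), len(val_ids))  # 280 120
```

Because the 400 nodules come from 392 patients, at least some patients contributed more than one nodule. The abstract does not say whether the split was made per nodule or per patient; the sketch splits per nodule.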

Results

Eight radiomics features were ultimately identified. Among the seven models, the GradientBoosting model exhibited the best performance, achieving an AUC of 0.929 (95% CI: 0.9004–0.9584), accuracy of 0.85, sensitivity of 0.851, and specificity of 0.849 in the training set.
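For readers unfamiliar with the reported indicators, the sketch below shows how accuracy, sensitivity, specificity, and AUC are defined for a binary classifier. It uses toy labels and scores invented for illustration, not the study's data, and the helper names `binary_metrics` and `roc_auc` are hypothetical. AUC is computed here by the pairwise (Mann-Whitney) definition, which equals the area under the ROC curve.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity from binary labels (1 = invasive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (invented values): three invasive and three non-invasive nodules.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, round(auc, 3))
```

Accuracy, sensitivity, and specificity depend on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds; the abstract does not state which threshold was used.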

Conclusion

The GradientBoosting model based on CT radiomics features demonstrates superior performance in predicting the pathological subtypes of lung adenocarcinoma presenting as ground-glass nodules, providing a reliable auxiliary tool for clinical diagnosis.
Source journal
European Journal of Radiology Open (Medicine: Radiology, Nuclear Medicine and Imaging)
CiteScore: 4.10
Self-citation rate: 5.00%
Articles published: 55
Review time: 51 days