Rui Chen, Hu Zhang, Xingwen Huang, Haitao Han, Jinbo Jian
European Journal of Radiology Open, volume 15, Article 100680. Published 2025-08-23. DOI: 10.1016/j.ejro.2025.100680. https://www.sciencedirect.com/science/article/pii/S2352047725000474
CT Radiomics-based machine learning approach for the invasiveness of pulmonary ground-glass nodules prediction
Objective
To develop and validate a CT radiomics-based machine learning model that better differentiates the pathological subtypes of pulmonary ground-glass nodules (GGNs).
Methods
A retrospective analysis was conducted on clinical data and radiological images from 392 patients with lung adenocarcinoma at Binzhou Medical University Hospital between January 1, 2020 and May 31, 2023. All patients underwent preoperative thin-section chest CT and surgical resection. A total of 400 GGNs were included. Regions of interest (ROIs) were delineated on the slice showing the largest lesion diameter. Based on pathological confirmation, the nodules were divided into two groups: Group 1 (adenocarcinoma in situ, AIS, or minimally invasive adenocarcinoma, MIA; 209 nodules) and Group 2 (invasive adenocarcinoma, IAC; 191 nodules). The dataset was randomly split at a 7:3 ratio into a training set (280 nodules) and a validation set (120 nodules). In the training set, feature dimensionality was reduced using minimum redundancy maximum relevance (mRMR) and the least absolute shrinkage and selection operator (LASSO) to select discriminative radiomics features. Seven machine learning models were then constructed: logistic regression (LR), support vector machine (SVM), random forest (RF), extra trees, XGBoost, GradientBoosting, and AdaBoost. Model performance was evaluated using receiver operating characteristic (ROC) curves and the following indicators: area under the curve (AUC), accuracy, specificity, and sensitivity.
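The workflow described above (a 7:3 split, LASSO-based feature selection on the training set, then a GradientBoosting classifier scored by AUC) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the abstract does not name the software used, so scikit-learn is an assumption; the data are synthetic stand-ins generated with `make_classification`; the feature count of 100 and all hyperparameters are placeholders; and the mRMR pre-filtering step is left out of the sketch.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LassoCV
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a radiomics table: 400 nodules x 100 features.
# (Assumption: the real feature matrix is not part of the abstract.)
X, y = make_classification(
    n_samples=400, n_features=100, n_informative=8, random_state=0
)

# 7:3 split: 120 of the 400 nodules are held out for validation.
# Stratifying keeps the class balance similar in both sets.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=120, stratify=y, random_state=0
)

# Standardize with training-set statistics only, to avoid leakage.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_val_s = scaler.transform(X_val)

# LASSO with a cross-validated penalty; features whose coefficients
# stay nonzero are kept. (The mRMR step is omitted in this sketch.)
lasso = LassoCV(cv=5, random_state=0).fit(X_train_s, y_train)
selected = np.flatnonzero(lasso.coef_)
if selected.size == 0:
    # Fallback for the sketch: keep everything if LASSO drops all features.
    selected = np.arange(X.shape[1])

# Fit one of the seven model families on the selected features.
model = GradientBoostingClassifier(random_state=0)
model.fit(X_train_s[:, selected], y_train)

# Score the held-out validation set by AUC.
val_scores = model.predict_proba(X_val_s[:, selected])[:, 1]
val_auc = roc_auc_score(y_val, val_scores)
print(f"{selected.size} features kept, validation AUC = {val_auc:.3f}")
```

Feature selection and scaling are fitted on the training set alone, so the validation AUC is not influenced by information from the held-out nodules.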
Results
Eight radiomics features were ultimately selected. Among the seven models, GradientBoosting performed best, achieving an AUC of 0.929 (95% CI: 0.9004–0.9584), accuracy of 0.85, sensitivity of 0.851, and specificity of 0.849 in the training set.
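For readers less familiar with the reported indicators, the sketch below shows how accuracy, sensitivity, specificity, and AUC are derived from labels and predicted scores. The labels, scores, and the 0.5 decision threshold are invented for illustration; they are not the study's data, and the function names are hypothetical.

```python
def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def auc(y_true, scores):
    """AUC as the chance that a positive outscores a negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and scores, invented for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [int(s >= 0.5) for s in scores]  # assumed threshold of 0.5

metrics = confusion_metrics(y_true, y_pred)
toy_auc = auc(y_true, scores)
print(metrics, toy_auc)
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two kinds of indicator are usually reported together.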
Conclusion
The GradientBoosting model based on CT radiomics features outperformed the other six models in predicting the pathological subtype of lung adenocarcinoma presenting as ground-glass nodules, and may serve as a reliable auxiliary tool for clinical diagnosis.