Radiomics-Driven Lung Adenocarcinoma Subtype Classification
Dang Zhang, Xiaoming Wu, Bo Wang, Xinran Wang, Peilin Sheng, Wei Jin, Lilin Guo, Xiaobo Lai, Jian Xu, Jianqing Wang
International Journal of Imaging Systems and Technology, Volume 35, Issue 5. Published 2025-09-19.
DOI: 10.1002/ima.70211
https://onlinelibrary.wiley.com/doi/10.1002/ima.70211
Abstract
Objective: This study aimed to identify the optimal classification model for lung adenocarcinoma (LUAD) subtypes through radiomics-driven analysis, addressing challenges such as data set imbalance, small sample sizes, and the need for accurate multi-class classification.

Methods: Radiomic features were extracted from CT scans and integrated with machine-learning and deep-learning techniques to improve diagnostic accuracy. After preliminary feature selection, the most effective feature subsets were identified by comparing single-stage and multi-stage feature selection methods, such as recursive feature elimination (RFE), random forest (RF), and Lasso. SMOTE techniques were applied to address class imbalance through data augmentation, and loss functions such as cross-entropy were used for model training and evaluation. Finally, classification was performed using RF, KNN, GBDT, SVM, Stacking, Voting, and deep-learning models (ResNet-18, ResNet-50, VGG16, etc.).

Results: The MStacking model, based on mutual information (MI) and the stacking ensemble algorithm, achieved superior performance with a classification accuracy of 82.00%, precision of 82.00%, F1 score of 83.00%, AUC of 95.00%, sensitivity of 79.00%, and specificity of 94.00%. These results outperformed other methods. Deep-learning models showed limited performance when trained on small sample sizes. However, when integrated with radiomics features, CNN models, particularly ResNet-50, demonstrated significantly improved performance, especially when addressing class imbalance using SMOTE, with ResNet-50's accuracy increasing by 20%. The MStacking model also showed stable performance in multi-class tasks.

Conclusion: Radiomics-driven deep-learning models demonstrated a significant advantage in LUAD subtype classification, particularly when dealing with small sample sizes. Integrating radiomics features enhanced the performance of deep-learning models, offering a promising approach for LUAD classification.
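To make the SMOTE step mentioned in the abstract concrete, here is a minimal sketch of its core idea: synthetic minority-class samples are created by interpolating between a minority sample and one of its nearest minority-class neighbours. This is a simplified illustration, not the authors' implementation. The function name `smote_oversample`, its parameters, and the toy feature vectors are all hypothetical, and the abstract does not state which SMOTE variant or settings were used.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples by interpolating between
    minority-class samples and their nearest minority neighbours.

    Simplified sketch of the SMOTE idea; names and defaults are assumptions.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # The new point lies on the segment between base and neighbour
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: 4 minority-class samples with 2 radiomic features each
minority_features = [[0.10, 1.20], [0.15, 1.10], [0.20, 1.30], [0.12, 1.25]]
new_samples = smote_oversample(minority_features, n_new=6, k=3)
balanced_minority = minority_features + new_samples
print(len(balanced_minority))  # 10
```

Because every synthetic point is a convex combination of two real samples, it stays within the range already spanned by the minority class. In a typical workflow, oversampling of this kind is applied only to the training split, so that synthetic points derived from test samples do not leak into evaluation.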
About the journal:
The International Journal of Imaging Systems and Technology (IMA) is a forum for the exchange of ideas and results relevant to imaging systems, including imaging physics and informatics. The journal covers all imaging modalities in humans and animals.
IMA accepts technically sound and scientifically rigorous research in the interdisciplinary field of imaging, including relevant algorithmic research and hardware and software development, and their applications relevant to medical research. The journal provides a platform to publish original research in structural and functional imaging.
The journal is also open to imaging studies of humans and animals that describe novel diagnostic imaging and analysis methods. Technical, theoretical, and clinical research in both normal and clinical populations is encouraged. Submissions describing methods, software, databases, replication studies, and negative results are also considered.
The scope of the journal includes, but is not limited to, the following in the context of biomedical research:
Imaging and neuro-imaging modalities: structural MRI, functional MRI, PET, SPECT, CT, ultrasound, EEG, MEG, NIRS etc.;
Neuromodulation and brain stimulation techniques such as TMS and tDCS;
Software and hardware for imaging, especially related to human and animal health;
Image segmentation in normal and clinical populations;
Pattern analysis and classification using machine learning techniques;
Computational modeling and analysis;
Brain connectivity and connectomics;
Systems-level characterization of brain function;
Neural networks and neurorobotics;
Computer vision, based on human/animal physiology;
Brain-computer interface (BCI) technology;
Big data, databasing and data mining.