Predictive radiomics based ensemble machine learning approach in CT lung nodule diagnosis
Authors: Arooj Nissar, A H Mir
Journal: Journal of the Egyptian National Cancer Institute, volume 37, issue 1, article 68 (impact factor 1.8; JCR Q3, Oncology)
Publication date: 2025-10-13
Publication type: Journal Article
DOI: https://doi.org/10.1186/s43046-025-00326-7
Citations: 0
Abstract
Background: Computed tomography (CT) is a non-invasive imaging tool used by medical professionals worldwide to identify and diagnose lung cancer, a lethal disease with high incidence and mortality. Radiomics features extracted from medical images, including CT, combined with machine learning frameworks have received considerable research attention for lung nodule identification. Such work can give clinicians a radiomics-based decision support system for better and quicker decisions on early diagnosis and treatment. However, it remains unclear which radiomics features to use for pulmonary nodule prediction. This work therefore aims to apply machine learning techniques and radiomics efficiently to the classification of CT pulmonary nodules.
Methods: The Lung Image Database Consortium (LIDC) dataset, containing 1018 CT cancer cases, is used. Radiomics features are extracted using the Wavelet Packet Transform (WPT) together with geometrical features and three texture techniques: the gray level run length matrix (GLRLM), the gray level co-occurrence method (GLCM) and the gray level difference method (GLDM). Two ensemble feature selection (EFS) techniques, boosted and bagged classification ensemble trees (BOCET and BACET), are employed to choose an appropriate set of features. The classification of nodules as malignant or benign is evaluated with the following machine learning models: Support Vector Machines, Boosted Classification Ensemble Tree, Decision Trees, Bagged Classification Ensemble Tree, RUSBoosted Ensemble Trees, Subspace Discriminant Ensemble and Subspace KNN Ensemble.
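As an illustration of the co-occurrence texture step named above, the following is a minimal sketch of a gray level co-occurrence matrix and three descriptors commonly derived from it. It is not the authors' implementation: the function names `glcm` and `glcm_features`, the single horizontal offset, the symmetric counting, and the 4x4 toy patch are all assumptions chosen for readability.

```python
from collections import Counter


def glcm(image, dx=1, dy=0, symmetric=True):
    """Return a normalized co-occurrence matrix as {(i, j): probability}.

    `image` is a list of rows of integer gray levels. (dx, dy) is the pixel
    offset between the two members of each pair (hypothetical default:
    one step to the right).
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[(i, j)] += 1
                if symmetric:
                    # Count the pair in both directions.
                    counts[(j, i)] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def glcm_features(p):
    """Compute contrast, energy and homogeneity from a normalized GLCM."""
    contrast = sum(prob * (i - j) ** 2 for (i, j), prob in p.items())
    energy = sum(prob ** 2 for prob in p.values())
    homogeneity = sum(prob / (1 + abs(i - j)) for (i, j), prob in p.items())
    return {"contrast": contrast, "energy": energy, "homogeneity": homogeneity}


# Toy 4x4 patch with four gray levels (illustrative data, not from LIDC).
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
matrix = glcm(patch)
features = glcm_features(matrix)
print(features)
```

In practice a nodule region would be quantized to a fixed number of gray levels first, and the descriptors would be averaged over several offsets and directions; those choices are not specified in the abstract.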
Results: With the BACET feature selection method, the Ensemble Subspace KNN achieves the best AUROC (93.4%), accuracy (88.3%) and F1-score (85.2%). FGSVM produces the best sensitivity (97.1%). RUSBOCET gives the best precision (93.4%) and the best specificity (83.1%).
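For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity, precision, accuracy, F1-score and AUROC are conventionally computed for a binary malignant (1) versus benign (0) task. The labels and scores are made-up toy values, not results from the study, and the helper names `classification_metrics` and `auroc` are hypothetical. The code assumes both classes are present and that at least one positive prediction is made.

```python
def classification_metrics(y_true, y_pred):
    """Threshold-based metrics from binary labels (1 = malignant, 0 = benign)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)   # recall on malignant cases
    specificity = tn / (tn + fp)   # recall on benign cases
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / len(pairs)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


def auroc(y_true, scores):
    """AUROC as the probability that a random positive outscores a random
    negative, with ties counted as one half (rank-based definition)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy labels and classifier scores (illustrative only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = classification_metrics(y_true, y_pred)
area = auroc(y_true, scores)
print(metrics, area)
```

Because sensitivity, specificity and precision depend on the decision threshold while AUROC does not, different classifiers can each be best on a different metric, as in the results above.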
Conclusion: Lung cancer remains the most common and deadly type of cancer, and early detection of lung lesions and nodules is crucial in fighting it. This study investigated radiomics based on geometrical, texture, and Daubechies WPT texture features for quantitative CT image analysis, using the LIDC database. Geometrical features, texture features based on three statistical methodologies (GLCM, GLDM, GLRLM) and Daubechies WPT texture features were extracted from the nodules. Pertinent features were identified with the ensemble feature selection (EFS) methods BOCET and BACET. Finally, various ML classifiers were used to classify nodules as malignant or benign. The results show that, using BACET EFS, Ensemble Subspace KNN gives the best AUROC (93.4%), accuracy (88.3%) and F1-score (85.2%). FGSVM yields the best sensitivity (97.1%). RUSBOCET gives the best precision (93.4%) and the best specificity (83.1%). The methodology can therefore be applied effectively to CT-based pulmonary nodule (PN) classification, and the results can assist medical professionals in making better decisions and interventions.
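The idea behind a subspace KNN ensemble can be sketched as follows: each member is a k-nearest-neighbour classifier that sees only a random subset of the features, and the ensemble predicts by majority vote. This is a generic illustration of the random subspace technique, not the configuration used in the study; the class name `SubspaceKNN`, its default parameters, and the toy feature vectors are all assumptions.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, features, k):
    """Classify x by majority vote of its k nearest training rows,
    measuring squared Euclidean distance on the given feature indices only."""
    dists = sorted(
        (sum((row[f] - x[f]) ** 2 for f in features), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


class SubspaceKNN:
    """Random subspace ensemble of KNN learners (hypothetical defaults)."""

    def __init__(self, n_learners=15, subspace_size=2, k=3, seed=0):
        self.n_learners = n_learners
        self.subspace_size = subspace_size
        self.k = k
        self.seed = seed

    def fit(self, X, y):
        rng = random.Random(self.seed)
        n_features = len(X[0])
        # Each learner gets its own random subset of feature indices.
        self.subspaces = [
            rng.sample(range(n_features), self.subspace_size)
            for _ in range(self.n_learners)
        ]
        self.X, self.y = X, y
        return self

    def predict(self, X):
        predictions = []
        for x in X:
            votes = Counter(
                knn_predict(self.X, self.y, x, subspace, self.k)
                for subspace in self.subspaces
            )
            predictions.append(votes.most_common(1)[0][0])
        return predictions


# Toy radiomics-like vectors: 0 = benign, 1 = malignant (illustrative only).
X_train = [
    [0.1, 0.2, 0.0, 0.3], [0.2, 0.1, 0.3, 0.0],
    [0.0, 0.3, 0.1, 0.2], [0.3, 0.0, 0.2, 0.1],
    [5.1, 5.2, 5.0, 5.3], [5.2, 5.1, 5.3, 5.0],
    [5.0, 5.3, 5.1, 5.2], [5.3, 5.0, 5.2, 5.1],
]
y_train = [0, 0, 0, 0, 1, 1, 1, 1]

model = SubspaceKNN().fit(X_train, y_train)
predicted = model.predict([[0.2, 0.2, 0.2, 0.2], [5.1, 5.1, 5.1, 5.1]])
print(predicted)
```

Training each learner on a different feature subset decorrelates the members, which is one plausible reason such an ensemble can outperform a single KNN on high-dimensional radiomics features.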
About the journal:
As the official publication of the National Cancer Institute, Cairo University, the Journal of the Egyptian National Cancer Institute (JENCI) is an open access, peer-reviewed journal that publishes the latest innovations in oncology, providing academics and clinicians with a leading research platform. JENCI welcomes submissions in all fields of basic, applied and clinical cancer research. Main topics of interest include: local and systemic anticancer therapy (with specific interest in applied cancer research from developing countries); experimental oncology; early cancer detection; randomized trials (including negative ones); and key emerging fields of personalized medicine, such as molecular pathology, bioinformatics, and biotechnologies.