Evaluation of radiomic feature harmonization techniques for benign and malignant pulmonary nodules

Claire Huchthausen, Menglin Shi, Gabriel L A de Sousa, Jonathan Colen, Emery Shelley, James Larner, Einsley Janowski, Krishni Wijesooriya

ArXiv, published 2025-01-15. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11774441/pdf/
Abstract
Background: Conventional methods for detecting lung cancer early are often qualitative and subject to interpretation. Radiomics provides quantitative characteristics of pulmonary nodules (PNs) in medical images, but variability in medical image acquisition is an obstacle to consistent clinical application of these quantitative features. Correcting radiomic features' dependency on acquisition parameters is problematic when combining data from benign and malignant PNs, as is necessary when the goal is to diagnose lung cancer, because acquisition effects may differ between them due to their biological differences.
Purpose: We evaluated whether we must account for biological differences between benign and malignant PNs when correcting the dependency of radiomic features on acquisition parameters, and we compared methods of doing this using ComBat harmonization.
Methods: This study used a dataset of 567 clinical chest CT scans containing both malignant and benign PNs. Scans were grouped as benign, malignant, or lung cancer screening (mixed benign and malignant). Preprocessing and feature extraction from regions of interest (ROIs) were performed using PyRadiomics. Optimized Permutation Nested ComBat harmonization was performed on extracted features to account for variability in four aspects of the imaging protocol: contrast enhancement, scanner manufacturer, acquisition voltage, and focal spot size. Three methods were compared: harmonizing all data collectively in the standard manner, harmonizing all data with a covariate to preserve distinctions between subgroups, and harmonizing subgroups separately. A Kruskal-Wallis test, with significance defined as p ≤ 0.05, determined whether harmonization removed a feature's dependency on an acquisition parameter. A LASSO-SVM pipeline was trained using acquisition-independent radiomic features to predict whether PNs were malignant or benign. To evaluate the predictive information made available by each harmonization method, the trained harmonization estimators and predictive model were applied to a corresponding unseen test set. Harmonization and predictive performance metrics were assessed over 10 trials of 5-fold cross-validation.
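The Kruskal-Wallis criterion described in the Methods can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes that a feature counts as acquisition-independent when the test is not significant (p > 0.05) across the settings of one acquisition parameter. The function name `acquisition_independent` and the toy data are hypothetical; `scipy.stats.kruskal` is the SciPy implementation of the test.

```python
import numpy as np
from scipy.stats import kruskal


def acquisition_independent(features, batch_labels, alpha=0.05):
    """Return a boolean mask that is True where a feature shows no
    significant dependence on the acquisition parameter (p > alpha).

    Assumption: this reading of the criterion is ours, not the paper's code.
    """
    features = np.asarray(features, dtype=float)
    batch_labels = np.asarray(batch_labels)
    batches = np.unique(batch_labels)
    mask = np.zeros(features.shape[1], dtype=bool)
    for j in range(features.shape[1]):
        # One sample of feature values per acquisition setting.
        groups = [features[batch_labels == b, j] for b in batches]
        _, p_value = kruskal(*groups)
        mask[j] = p_value > alpha
    return mask


# Hypothetical toy data: 20 nodules, 2 features, 2 acquisition settings.
batch = np.array([0] * 10 + [1] * 10)
# Feature 0 is strongly shifted in the second setting.
shifted = np.concatenate([np.arange(10), np.arange(100, 110)])
# Feature 1 has interleaved values, so both settings look alike.
balanced = np.concatenate([np.arange(0, 20, 2), np.arange(1, 20, 2)])
X = np.column_stack([shifted, balanced])

mask = acquisition_independent(X, batch)
print(mask)
```

In a study with several acquisition parameters, a check like this would presumably be repeated for each parameter, keeping only the features that pass all of them.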
Results: On average, the Kruskal-Wallis test identified 2.1% of features (95% CI: 1.9-2.4%) as acquisition-independent when data were harmonized collectively, 27.3% (95% CI: 25.7-28.9%) when harmonized with a covariate, and 90.9% (95% CI: 90.4-91.5%) when harmonized separately. LASSO-SVM models trained on data harmonized separately or with a covariate had higher ROC-AUC for lung cancer screening scans than models trained on data harmonized without distinction between benign and malignant tissues (DeLong test, Holm-Bonferroni adjusted p ≤ 0.05). There was no conclusive difference in ROC-AUC between models trained on data harmonized separately and models trained on data harmonized with a covariate.
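The Holm-Bonferroni adjustment mentioned in the Results is a standard step-down correction for multiple comparisons, and a small pure-Python sketch may make it concrete. The function name `holm_bonferroni` and the three p-values are hypothetical, chosen only to illustrate three pairwise comparisons between harmonization methods. They are not values from the study.

```python
def holm_bonferroni(p_values):
    """Return Holm-Bonferroni adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is scaled by (m - k + 1).
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for three pairwise model comparisons.
raw = [0.01, 0.04, 0.03]
adjusted = holm_bonferroni(raw)
for p, q in zip(raw, adjusted):
    print(f"raw={p:.3f} adjusted={q:.3f} significant={q <= 0.05}")
```

A comparison is then called significant when its adjusted p-value is at most 0.05, matching the threshold quoted above.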
Conclusions: Radiomic features of benign and malignant PNs require different corrective transformations to recover acquisition-independent distributions. This can be done using separate harmonization or harmonization with a covariate. Separate harmonization enabled the greatest number of predictive features to be used in a machine learning model to retrospectively detect lung cancer. Features harmonized separately and features harmonized with a covariate enabled predictive models to achieve similar performance on lung cancer screening scans.
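A toy example can illustrate why benign and malignant PNs might need different corrective transformations. The sketch below does not use ComBat. It swaps in a much simpler location-only alignment (shifting each batch mean to the overall mean) and assumes, purely for illustration, that the acquisition shift is +2 for benign and +6 for malignant nodules. All names and numbers are hypothetical, and the covariate variant is not modeled.

```python
import numpy as np


def align_batch_means(values, batch):
    """Shift each batch so its mean matches the mean of all values passed in.

    A simplified stand-in for harmonization, not ComBat.
    """
    values = np.asarray(values, dtype=float)
    aligned = values.copy()
    for b in np.unique(batch):
        in_batch = batch == b
        aligned[in_batch] += values.mean() - values[in_batch].mean()
    return aligned


def batch_gap(values, batch):
    """Mean of batch 1 minus mean of batch 0: the residual acquisition effect."""
    return float(values[batch == 1].mean() - values[batch == 0].mean())


# Hypothetical toy feature: 5 nodules per (subgroup, batch) cell.
malignant = np.repeat([False, True], 10)
batch = np.tile(np.repeat([0, 1], 5), 2)
base = np.where(malignant, 10.0, 0.0)
# Assumed subgroup-specific acquisition shifts in batch 1.
shift = np.where(batch == 1, np.where(malignant, 6.0, 2.0), 0.0)
feature = base + shift + np.tile(np.linspace(-1.0, 1.0, 5), 4)

# Collective: one correction per batch, ignoring the subgroup.
collective = align_batch_means(feature, batch)

# Separate: corrections estimated within each subgroup.
separate = np.empty_like(feature)
for flag in (False, True):
    sub = malignant == flag
    separate[sub] = align_batch_means(feature[sub], batch[sub])

for name, data in [("collective", collective), ("separate", separate)]:
    for flag, label in [(False, "benign"), (True, "malignant")]:
        sub = malignant == flag
        print(name, label, round(batch_gap(data[sub], batch[sub]), 6))
```

In this toy setting the collective correction removes only the average shift, so each subgroup keeps a residual gap between batches, while the separate correction removes it within both subgroups.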