Impact of harmonization and oversampling methods on radiomics analysis of multi-center imbalanced datasets: application to PET-based prediction of lung cancer subtypes.
Impact factor 3.0 · Zone 2 (Medicine) · JCR Q2, Radiology, Nuclear Medicine & Medical Imaging
Dongyang Du, Isaac Shiri, Fereshteh Yousefirizi, Mohammad R Salmanpour, Jieqin Lv, Huiqin Wu, Wentao Zhu, Habib Zaidi, Lijun Lu, Arman Rahmim
EJNMMI Physics 12(1):34. Journal article, published 2025-04-07. DOI: 10.1186/s40658-025-00750-7 (https://doi.org/10.1186/s40658-025-00750-7). Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11977052/pdf/
Citations: 0
Abstract
Background: Medical imaging data frequently exhibit heterogeneity in image generation as well as class imbalance, both of which make it difficult for data-driven machine-learning methods to achieve strong, generalizable predictive performance. The purpose of this study was to investigate the impact of harmonization and oversampling methods on multi-center imbalanced datasets, with specific application to PET-based radiomics modeling for histologic subtype prediction in non-small cell lung cancer (NSCLC).
Methods: This retrospective study included 245 patients with adenocarcinoma (ADC) and 78 patients with squamous cell carcinoma (SCC) from 4 centers. Using 1502 radiomics features per patient, we trained, validated, and tested 4 machine-learning classifiers to investigate the effect on subtype prediction of no harmonization (NoH) or 4 feature harmonization methods, paired with no oversampling (NoO) or 5 oversampling methods. Model performance was evaluated using the average area under the ROC curve (AUROC) and G-mean over 5 repetitions of 5-fold cross-validation. Each combined model was compared statistically against baseline (NoH + NoO) in each cross-validation fold using the DeLong test.
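The two metrics named in the Methods can be illustrated with a minimal sketch. It assumes the usual definitions (G-mean as the geometric mean of sensitivity and specificity, AUROC as the probability that a positive case is scored above a negative one) and treats SCC as the positive class; the abstract states neither convention, so both are assumptions, and the function names are illustrative only. The example uses the cohort sizes from the abstract to show why accuracy alone is misleading on an imbalanced dataset.

```python
import math


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


def auroc(y_true, scores, positive=1):
    """AUROC via pairwise comparison; tied scores count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Cohort sizes from the abstract: 245 ADC (label 0) and 78 SCC (label 1).
# Assumption: SCC is treated as the positive class.
y_true = [0] * 245 + [1] * 78

# A degenerate classifier that always predicts the majority class (ADC).
always_adc = [0] * len(y_true)
accuracy = sum(t == p for t, p in zip(y_true, always_adc)) / len(y_true)
majority_g_mean = g_mean(y_true, always_adc)

print(f"accuracy = {accuracy:.3f}, G-mean = {majority_g_mean:.3f}")
```

The always-ADC classifier is correct for 245 of 323 patients yet never detects SCC, so its sensitivity, and therefore its G-mean, is zero. This is why G-mean is a useful companion to AUROC on imbalanced data.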
Results: The number of cross-combinations in which both AUROC and G-mean outperformed baseline in validation and testing was 15, 4, 2, and 7 (out of 29) for random forest (RF), linear discriminant analysis (LDA), logistic regression (LR), and support vector machine (SVM), respectively. ComBat harmonization combined with SMOTE oversampling and an RF classifier outperformed baseline (validation AUROC 0.725 vs. 0.608 and G-mean 0.625 vs. 0.398; testing AUROC 0.637 vs. 0.567 and G-mean 0.506 vs. 0.287), though the differences were not statistically significant.
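One accounting consistent with the "out of 29" figure is 5 harmonization options (NoH plus 4 methods) crossed with 6 oversampling options (NoO plus 5 methods), minus the NoH + NoO baseline. The sketch below enumerates that grid. Only ComBat and SMOTE are named in the abstract; the remaining method names (`H2`, `O2`, and so on) are hypothetical placeholders, not the methods used in the study.

```python
from itertools import product

# Named in the abstract: NoH, ComBat, NoO, SMOTE.
# "H2".."H4" and "O2".."O5" are hypothetical placeholders for unnamed methods.
harmonization = ["NoH", "ComBat", "H2", "H3", "H4"]
oversampling = ["NoO", "SMOTE", "O2", "O3", "O4", "O5"]

grid = list(product(harmonization, oversampling))
baseline = ("NoH", "NoO")
candidates = [combo for combo in grid if combo != baseline]

# Counts reported in the abstract: combinations beating baseline on both
# AUROC and G-mean, in validation and in testing.
better_than_baseline = {"RF": 15, "LDA": 4, "LR": 2, "SVM": 7}
fractions = {clf: n / len(candidates) for clf, n in better_than_baseline.items()}

print(len(grid), len(candidates))
for clf, frac in fractions.items():
    print(f"{clf}: {frac:.1%} of combinations beat baseline")
```

Expressed as fractions, roughly half of the combinations helped RF, while only 2 of 29 helped LR, which makes the classifier dependence of the effect concrete.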
Conclusions: Applying harmonization and oversampling methods to multi-center imbalanced datasets can improve NSCLC-subtype prediction, but the effect varies widely across classifiers. We have created open-source comparisons of harmonization and oversampling methods across different classifiers to support comprehensive evaluation in other studies.
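For readers unfamiliar with SMOTE, the oversampling method named in the Results, its core idea can be sketched in a few lines: each synthetic minority sample is a random interpolation between a minority sample and one of its nearest minority-class neighbours. This is a simplified illustration under stated assumptions (toy 2-D features, `k=5`, a hypothetical function name `smote_sketch`), not the implementation or settings used in the study. In practice, oversampling is applied to the training folds only.

```python
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic points by interpolating minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to x.
        others = [m for m in minority if m is not x]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(x, m)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbour)])
    return synthetic


# Toy data: 78 minority samples (the SCC count from the abstract) with two
# made-up features each; real radiomics vectors would have 1502 features.
data_rng = random.Random(1)
minority = [[data_rng.random(), data_rng.random()] for _ in range(78)]

# Number of synthetic samples needed to match the 245 majority (ADC) cases.
n_needed = 245 - 78
synthetic = smote_sketch(minority, n_new=n_needed)
print(len(minority) + len(synthetic))
```

Because every synthetic point lies on a segment between two real minority samples, the method fills in the minority region of feature space rather than duplicating existing cases.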
About the journal:
EJNMMI Physics is an international platform for scientists, users and adopters of nuclear medicine with a particular interest in physics matters. As a companion journal to the European Journal of Nuclear Medicine and Molecular Imaging, this journal has a multi-disciplinary approach and welcomes original materials and studies with a focus on applied physics and mathematics as well as imaging systems engineering and prototyping in nuclear medicine. This includes physics-driven approaches or algorithms supported by physics that foster early clinical adoption of nuclear medicine imaging and therapy.