Open-radiomics: a collection of standardized datasets and a technical protocol for reproducible radiomics machine learning pipelines.

Impact factor 3.2; Zone 3 (Medicine); JCR Q2, RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING

BMC Medical Imaging Pub Date : 2025-08-04 DOI:10.1186/s12880-025-01855-2

Khashayar Namdar, Matthias W Wagner, Birgit B Ertl-Wagner, Farzad Khalvati

BMC Medical Imaging 25(1), article 312. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12323200/pdf/

Abstract

Background: As an important branch of machine learning pipelines in medical imaging, radiomics faces two major challenges, namely reproducibility and accessibility. In this work, we introduce open-radiomics, a set of radiomics datasets along with a comprehensive radiomics pipeline based on our proposed technical protocol, to investigate the effects of radiomics feature extraction on the reproducibility of the results.

Methods: We curated large-scale radiomics datasets based on three open-source datasets: BraTS 2020 for high-grade glioma (HGG) versus low-grade glioma (LGG) classification and survival analysis, BraTS 2023 for O6-methylguanine-DNA methyltransferase (MGMT) classification, and non-small cell lung cancer (NSCLC) survival analysis from The Cancer Imaging Archive (TCIA). We used the BraTS 2020 open-source Magnetic Resonance Imaging (MRI) dataset to demonstrate how our proposed technical protocol could be utilized in radiomics-based studies. The cohort includes 369 adult patients with brain tumors (76 LGG and 293 HGG). Using the PyRadiomics library for LGG vs. HGG classification, we created 288 radiomics datasets, one for each combination of 4 MRI sequences, 3 binWidths, 6 image normalization methods, and 4 tumor subregions (4 × 3 × 6 × 4 = 288). We used Random Forest classifiers, and for each radiomics dataset, we repeated the training-validation-test (60%/20%/20%) experiment 100 times with different data splits and model random states (28,800 test results in total) and calculated the Area Under the Receiver Operating Characteristic Curve (AUROC).
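The experiment counts in the Methods can be checked with a short sketch. This is only an illustration of how such a configuration grid might be enumerated, not the authors' pipeline. The four sequence names are the standard BraTS modalities; the binWidth, normalization, and subregion labels are hypothetical placeholders, because the abstract reports only how many of each were used, not their values.

```python
from itertools import product

# The four MRI sequences distributed with BraTS.
SEQUENCES = ["T1", "T1CE", "T2", "FLAIR"]

# Hypothetical placeholder labels: the abstract states only the counts
# (3 binWidths, 6 normalization methods, 4 tumor subregions).
BIN_WIDTHS = [f"binWidth_{i}" for i in range(1, 4)]
NORMALIZATIONS = [f"normalization_{i}" for i in range(1, 7)]
SUBREGIONS = [f"subregion_{i}" for i in range(1, 5)]

# Each radiomics dataset is evaluated with 100 different splits/random states.
REPEATS = 100


def enumerate_configs():
    """Return one dict per combination of extraction settings."""
    return [
        {"sequence": s, "bin_width": b, "normalization": n, "subregion": r}
        for s, b, n, r in product(SEQUENCES, BIN_WIDTHS, NORMALIZATIONS, SUBREGIONS)
    ]


configs = enumerate_configs()
total_runs = len(configs) * REPEATS
print(f"{len(configs)} radiomics datasets, {total_runs} test results")
```

Each dictionary stands for one radiomics dataset; pairing every dataset with 100 seeds gives the 28,800 test results reported in the abstract.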

Results: Unlike binWidth and image normalization, the tumor subregion and imaging sequence significantly affected the performance of the models. The T1 contrast-enhanced sequence and the union of the necrotic and non-enhancing tumor core subregions resulted in the highest AUROCs (average test AUROC 0.951, 95% confidence interval of (0.949, 0.952)). Although several settings and data splits (28 out of 28,800) yielded a test AUROC of 1, these results were irreproducible.
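An average AUROC with a 95% confidence interval, as reported above, can be computed from repeated test results roughly as follows. The abstract does not say how the interval was derived, so the normal approximation used here (mean ± 1.96 standard errors) is an assumption, and the AUROC values in the example are made up for illustration only.

```python
import math
import statistics


def mean_ci(values, z=1.96):
    """Return (mean, lower, upper) using a normal approximation.

    Assumption: the interval is mean +/- z * (sample std / sqrt(n)).
    The source does not specify its CI method.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to estimate a CI")
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, mean - z * sem, mean + z * sem


# Made-up AUROC values for illustration; these are NOT results from the paper.
example_aurocs = [0.94, 0.96, 0.95, 0.93, 0.97]
mean, lower, upper = mean_ci(example_aurocs)
print(f"mean AUROC = {mean:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

The interval narrows with the square root of the number of repeats, which is why an average over many runs is a more stable summary than any single split, including one that happens to reach an AUROC of 1.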

Conclusions: Our experiments demonstrate that sources of variability in radiomics pipelines (e.g., tumor subregion) can have a significant impact on the results, which may lead to superficially perfect performance that is irreproducible.

Clinical trial number: Not applicable.

Source journal

BMC Medical Imaging (RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING)

CiteScore: 4.60
Self-citation rate: 3.70%
Articles published: 198
Review time: 27 weeks

Journal description: BMC Medical Imaging is an open access journal publishing original peer-reviewed research articles in the development, evaluation, and use of imaging techniques and image processing tools to diagnose and manage disease.