{"title":"Comparative Evaluation of Radiomics and Deep Learning Models for Disease Detection in Chest Radiography.","authors":"Zhijin He, Alan B McMillan","doi":"10.1007/s10278-025-01670-9","DOIUrl":null,"url":null,"abstract":"<p><p>The application of artificial intelligence (AI) in medical imaging has revolutionized diagnostic practices, enabling advanced analysis and interpretation of radiological data. This study presents a comprehensive evaluation of radiomics-based and deep learning-based approaches for disease detection in chest radiography, focusing on COVID-19, lung opacity, and viral pneumonia. While deep learning models, particularly convolutional neural networks (CNNs) and vision transformers (ViTs), learn directly from image data, radiomics-based models extract handcrafted features, offering potential advantages in data-limited scenarios. We systematically compared the diagnostic performance of various AI models, including Decision Trees, Gradient Boosting, Random Forests, Support Vector Machines (SVMs), and Multi-Layer Perceptrons (MLPs) for radiomics, against state-of-the-art deep learning models such as InceptionV3, EfficientNetL, and ConvNeXtXLarge. Performance was evaluated across multiple sample sizes. At 24 samples, EfficientNetL achieved an AUC of 0.839, outperforming SVM (AUC = 0.762). At 4000 samples, InceptionV3 achieved the highest AUC of 0.996, compared to 0.885 for Random Forest. A Scheirer-Ray-Hare test confirmed significant main and interaction effects of model type and sample size on all metrics. Post hoc Mann-Whitney U tests with Bonferroni correction further revealed consistent performance advantages for deep learning models across most conditions. These findings provide statistically validated, data-driven recommendations for model selection in diagnostic AI. Deep learning models demonstrated higher performance and better scalability with increasing data availability, while radiomics-based models may remain useful in low-data contexts. This study addresses a critical gap in AI-based diagnostic research by offering practical guidance for deploying AI models across diverse clinical environments.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2025-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of imaging informatics in medicine","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s10278-025-01670-9","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
The application of artificial intelligence (AI) in medical imaging has revolutionized diagnostic practices, enabling advanced analysis and interpretation of radiological data. This study presents a comprehensive evaluation of radiomics-based and deep learning-based approaches for disease detection in chest radiography, focusing on COVID-19, lung opacity, and viral pneumonia. While deep learning models, particularly convolutional neural networks (CNNs) and vision transformers (ViTs), learn directly from image data, radiomics-based models extract handcrafted features, offering potential advantages in data-limited scenarios. We systematically compared the diagnostic performance of various AI models, including Decision Trees, Gradient Boosting, Random Forests, Support Vector Machines (SVMs), and Multi-Layer Perceptrons (MLPs) for radiomics, against state-of-the-art deep learning models such as InceptionV3, EfficientNetL, and ConvNeXtXLarge. Performance was evaluated across multiple sample sizes. At 24 samples, EfficientNetL achieved an AUC of 0.839, outperforming SVM (AUC = 0.762). At 4000 samples, InceptionV3 achieved the highest AUC of 0.996, compared to 0.885 for Random Forest. A Scheirer-Ray-Hare test confirmed significant main and interaction effects of model type and sample size on all metrics. Post hoc Mann-Whitney U tests with Bonferroni correction further revealed consistent performance advantages for deep learning models across most conditions. These findings provide statistically validated, data-driven recommendations for model selection in diagnostic AI. Deep learning models demonstrated higher performance and better scalability with increasing data availability, while radiomics-based models may remain useful in low-data contexts. This study addresses a critical gap in AI-based diagnostic research by offering practical guidance for deploying AI models across diverse clinical environments.
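To make the post hoc comparison concrete, the sketch below illustrates the kind of analysis described in the abstract: pairwise Mann-Whitney U tests on per-run AUC scores with Bonferroni correction. It is not the authors' code; the AUC distributions are hypothetical (only their means echo the values reported above), and the number of repeated evaluations per condition is assumed.

```python
# Illustrative sketch (not the authors' pipeline): Mann-Whitney U tests on
# per-run AUC scores for a deep learning model vs. a radiomics-based model,
# with Bonferroni correction across comparisons. All score distributions
# below are simulated for demonstration only.
import numpy as np
from scipy.stats import mannwhitneyu
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)

# Hypothetical AUC scores from 10 repeated evaluations at two sample sizes;
# the means mirror the abstract's reported AUCs, the spreads are assumed.
auc_scores = {
    24:   {"EfficientNetL": rng.normal(0.839, 0.020, 10),
           "SVM":           rng.normal(0.762, 0.020, 10)},
    4000: {"InceptionV3":   rng.normal(0.996, 0.002, 10),
           "RandomForest":  rng.normal(0.885, 0.010, 10)},
}

p_values, labels = [], []
for n, models in auc_scores.items():
    (name_a, scores_a), (name_b, scores_b) = models.items()
    # Two-sided Mann-Whitney U test comparing the two groups of AUC scores.
    _, p = mannwhitneyu(scores_a, scores_b, alternative="two-sided")
    p_values.append(p)
    labels.append(f"{name_a} vs {name_b} at n={n}")

# Bonferroni correction across all pairwise comparisons.
reject, p_adj, _, _ = multipletests(p_values, alpha=0.05, method="bonferroni")
for label, p, sig in zip(labels, p_adj, reject):
    print(f"{label}: corrected p = {p:.4f}, significant = {sig}")
```

In the study itself, the Scheirer-Ray-Hare test is used first to establish main and interaction effects of model type and sample size, with the corrected Mann-Whitney U tests serving as the follow-up pairwise comparisons sketched here.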