Performance Comparison of Machine Learning Using Radiomic Features and CNN-Based Deep Learning in Benign and Malignant Classification of Vertebral Compression Fractures Using CT Scans.

Jong Chan Yeom, So Hyun Park, Young Jae Kim, Tae Ran Ahn, Kwang Gi Kim
Journal of Imaging Informatics in Medicine. Published 2025-06-02. DOI: 10.1007/s10278-025-01553-z (https://doi.org/10.1007/s10278-025-01553-z)

Abstract

Distinguishing benign from malignant vertebral compression fractures (VCFs) is critical for clinical management but remains challenging on contrast-enhanced abdominal CT, which lacks the soft-tissue contrast of MRI. This study compares radiomic feature-based machine learning (ML) with convolutional neural network (CNN)-based deep learning (DL) for classifying VCFs on abdominal CT. A retrospective cohort of 447 VCFs (196 benign, 251 malignant) from 286 patients was analyzed. Radiomic features were extracted with PyRadiomics, and Recursive Feature Elimination selected six key texture-based features (e.g., Run Variance, Dependence Non-Uniformity Normalized), pointing to textural heterogeneity as a marker of malignancy. ML models (XGBoost, SVM, KNN, Random Forest) and a 3D CNN were trained on the CT data, and performance was assessed by precision, recall, F1 score, accuracy, and AUC. The DL model performed marginally better overall, with a statistically significantly higher AUC (77.66% vs. 75.91%, p < 0.05) and better precision, F1 score, and accuracy than the top-performing ML model (XGBoost). The DL model's attention maps localized diagnostically relevant regions, mimicking radiologists' focus, whereas radiomics lacked spatial interpretability despite offering quantifiable biomarkers. The two approaches have complementary strengths: radiomics provides interpretable features tied to tumor heterogeneity, while DL autonomously extracts high-dimensional patterns with spatial explainability. Integrating both could improve diagnostic accuracy and clinician trust in abdominal CT-based VCF assessment. Limitations include retrospective single-center data and potential selection bias. Future multi-center studies with diverse protocols and histopathological validation are warranted to generalize these findings.
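The five evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `binary_metrics`, the 0.5 decision threshold, and the toy labels and scores are assumptions made purely for illustration, and none of the numbers come from the study. Labels use 1 for malignant and 0 for benign, and AUC is computed as the fraction of (malignant, benign) pairs in which the malignant case receives the higher score, with ties counted as half.

```python
from typing import Dict, Sequence


def binary_metrics(
    y_true: Sequence[int],
    y_score: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off; the abstract does not state one
) -> Dict[str, float]:
    """Precision, recall, F1, accuracy, and AUC for binary labels (1 = malignant)."""
    # Turn continuous scores into hard predictions at the chosen threshold.
    y_pred = [1 if s >= threshold else 0 for s in y_score]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / len(y_true)

    # AUC via pairwise comparison: threshold-free, unlike the metrics above.
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    auc = wins / (len(pos) * len(neg))

    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
        "auc": auc,
    }


# Toy data (hypothetical, not from the study): 4 malignant and 4 benign cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
metrics = binary_metrics(y_true, y_score)
print(metrics)
```

Because AUC depends only on how the scores rank cases, it can differ between two models even when their thresholded metrics are close, which is why it is reported alongside precision, recall, F1 score, and accuracy.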
