Optimizing radiomics for prostate cancer diagnosis: feature selection strategies, machine learning classifiers, and MRI sequences.

Impact factor 4.1 · CAS Zone 2 (Medicine) · JCR Q1: Radiology, Nuclear Medicine & Medical Imaging
Eugenia Mylona, Dimitrios I Zaridis, Charalampos N Kalantzopoulos, Nikolaos S Tachos, Daniele Regge, Nikolaos Papanikolaou, Manolis Tsiknakis, Kostas Marias, Dimitrios I Fotiadis
Insights into Imaging, volume 15, issue 1, article 265. Published 2024-11-04. Publication type: Journal Article.
DOI: https://doi.org/10.1186/s13244-024-01783-9
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11535140/pdf/

Abstract

Objectives: Radiomics-based analyses encompass multiple steps, leading to ambiguity regarding the optimal approaches for enhancing model performance. This study compares the effect of several feature selection methods, machine learning (ML) classifiers, and sources of radiomic features, on models' performance for the diagnosis of clinically significant prostate cancer (csPCa) from bi-parametric MRI.

Methods: Two multi-centric datasets, with 465 and 204 patients respectively, were used to extract 1246 radiomic features per patient and MRI sequence. Predictive models of csPCa were developed using ten feature selection methods (including Boruta, mRMRe, ReliefF, recursive feature elimination (RFE), random forest (RF) variable importance, and L1-lasso), four ML classifiers (SVM, RF, LASSO, and boosted generalized linear model (GLM)), and three sets of radiomic features derived from T2w images, ADC maps, and their combination. Model performance was evaluated in nested cross-validation and in external validation, using seven performance metrics.
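The nested cross-validation design described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes scikit-learn, uses synthetic data from `make_classification` as a stand-in for a radiomic feature table, and picks one selector (RFE) and one classifier (a linear SVM) from the families named above. The fold counts, the number of selected features, and the `C` grid are arbitrary assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a radiomic feature table (NOT the study's data).
X, y = make_classification(
    n_samples=200, n_features=100, n_informative=8, random_state=0
)

# Scaling and feature selection live inside the pipeline, so they are
# re-fitted on the training part of every fold and never see held-out data.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("select", RFE(LogisticRegression(max_iter=1000),
                   n_features_to_select=10, step=10)),
    ("clf", SVC(kernel="linear")),
])

# Inner loop: tunes the classifier's C. Outer loop: estimates performance.
inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

search = GridSearchCV(
    pipeline, param_grid={"clf__C": [0.1, 1.0]}, scoring="roc_auc", cv=inner_cv
)

# One AUC per outer fold; each is computed on data unseen by the inner search.
outer_auc = cross_val_score(search, X, y, scoring="roc_auc", cv=outer_cv)
print("Outer-fold AUCs:", outer_auc.round(3))
print("Mean AUC:", outer_auc.mean().round(3))
```

Keeping the selector inside the pipeline is the design choice that matters here: selecting features on the full dataset before splitting would leak information from the held-out folds into the model and tend to inflate the estimate.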

Results: In total, 480 models were developed. In nested cross-validation, the best model combined Boruta with boosted GLM (AUC = 0.71, F1 = 0.76). In external validation, the best model combined L1-lasso with boosted GLM (AUC = 0.71, F1 = 0.47). Overall, Boruta, RFE, L1-lasso, and RF variable importance were the top-performing feature selection methods, while the choice of ML classifier did not significantly affect the results. ADC-derived features showed the highest discriminatory power, T2w-derived features were less informative, and combining the two did not improve performance.
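The two best models share the same AUC but differ widely in F1. The abstract does not say why, so the sketch below is only a generic illustration of how the two metrics can diverge: AUC is rank-based and unaffected by class prevalence, whereas F1 depends on both the decision threshold and the class balance. The scores, the threshold of 0.5, and the two toy cohorts are invented for the example and are not taken from the study.

```python
def auc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1(labels, scores, threshold=0.5):
    """F1 score of the positive class when predicting positive for score >= threshold."""
    tp = sum(1 for l, s in zip(labels, scores) if l == 1 and s >= threshold)
    fp = sum(1 for l, s in zip(labels, scores) if l == 0 and s >= threshold)
    fn = sum(1 for l, s in zip(labels, scores) if l == 1 and s < threshold)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


# Hypothetical model outputs (illustrative values only).
pos_scores = [0.9, 0.8, 0.6, 0.4]
neg_scores = [0.7, 0.5, 0.3, 0.2]

# Cohort A: balanced classes. Cohort B: same score distributions,
# but four times as many negatives (lower prevalence).
labels_a = [1] * len(pos_scores) + [0] * len(neg_scores)
scores_a = pos_scores + neg_scores
labels_b = [1] * len(pos_scores) + [0] * (4 * len(neg_scores))
scores_b = pos_scores + neg_scores * 4

auc_a, auc_b = auc(labels_a, scores_a), auc(labels_b, scores_b)
f1_a, f1_b = f1(labels_a, scores_a), f1(labels_b, scores_b)
print(f"Cohort A: AUC={auc_a:.4f}, F1={f1_a:.4f}")
print(f"Cohort B: AUC={auc_b:.4f}, F1={f1_b:.4f}")
```

In this toy setting the AUC is identical in both cohorts while F1 drops in the lower-prevalence cohort, because the extra negatives add false positives at the fixed threshold. Whether prevalence or threshold effects played any role in the reported results cannot be determined from the abstract alone.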

Conclusion: The choice of feature selection method and the source of radiomic features have a profound effect on the models' performance for csPCa diagnosis.

Critical relevance statement: This work may guide future radiomic research, paving the way for more effective and reliable radiomic models, not only for advancing prostate cancer diagnostic strategies but also for informing broader applications of radiomics in other medical contexts.

Key points: Radiomics is a growing field that can still be optimized. The feature selection method affects radiomics models' performance more than the choice of ML algorithm. Best feature selection methods: RFE, LASSO, RF, and Boruta. ADC-derived radiomic features yield more robust models than T2w-derived radiomic features.

Source journal: Insights into Imaging (Medicine: Radiology, Nuclear Medicine and Imaging)
CiteScore: 7.30
Self-citation rate: 4.30%
Articles published: 182
Review time: 13 weeks
Journal description: Insights into Imaging (I³) is a peer-reviewed open access journal published under the SpringerOpen brand. All content published in the journal is freely available online to anyone, anywhere. I³ continuously updates scientific knowledge and progress in best-practice standards in radiology through the publication of original articles and state-of-the-art reviews and opinions, along with recommendations and statements from the leading radiological societies in Europe. Founded by the European Society of Radiology (ESR), I³ creates a platform for educational material, guidelines and recommendations, and a forum for topics of controversy. A balanced combination of review articles, original papers, short communications from European radiological congresses and information on society matters makes I³ an indispensable source for current information in this field. I³ is owned by the ESR; however, authors retain copyright to their articles under the Creative Commons Attribution License (see Copyright and License Agreement). All articles can be read, redistributed and reused for free, as long as the author of the original work is cited properly. The open access fees (article-processing charges) for this journal are sponsored by ESR for all members. The journal went open access in 2012, which means that all articles published since then are freely available online.