ROBI: a Robust and Optimized Biomarker Identifier to increase the likelihood of discovering relevant radiomic features.

Louis Rebaud, Nicolo Capobianco, Clementine Sarkozy, Anne-Segolene Cottereau, Laetitia Vercellino, Olivier Casasnovas, Catherine Thieblemont, Bruce Spottiswoode, Irene Buvat
medRxiv - Health Informatics. Published 2024-09-10. DOI: 10.1101/2024.09.09.24313059 (https://doi.org/10.1101/2024.09.09.24313059)

Abstract

Objectives: The Robust and Optimized Biomarker Identifier (ROBI) feature selection pipeline is introduced to improve the identification of informative biomarkers that encode information not already captured by existing features. It aims to maximize the number of discoveries while minimizing and accurately estimating the number of false positives (FP), with an adjustable selection stringency.

Methods: Validation used 500 synthetic datasets and retrospective data from 378 Diffuse Large B Cell Lymphoma (DLBCL) patients. On the DLBCL data, two established radiomic biomarkers, TMTV and Dmax, were measured from the 18F-FDG PET/CT scans, and 10,000 random biomarkers were generated. Selection was performed and verified on each dataset. The efficacy of ROBI was compared with that of methods controlling for multiple testing and of a Cox model with Elasticnet penalty.

Results: On synthetic datasets, ROBI selected significantly more true positives (TP) than FP (p < 0.001), and for 99.3% of datasets the number of FP was within the estimated 95% confidence interval. ROBI significantly increased the number of TP compared with usual feature selection methods (p < 0.001). On retrospective data, ROBI selected the two established biomarkers and one random biomarker, and estimated a 95% chance of selecting 0 or 1 FP and a probability of 0.0014 of selecting only FP. Bonferroni correction selected no feature, and Elasticnet selected 101 spurious features and discarded TMTV.

Conclusion: ROBI selected relevant biomarkers while effectively controlling for FPs, outperforming conventional selection methods. This underscores its potential as a valuable asset for biomarker discovery.
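To make the multiple-testing baselines concrete, here is a minimal Python sketch of how such corrections behave when two candidate biomarkers are screened alongside 10,000 uninformative ones. It is not ROBI and does not reproduce the study: the p-values are invented for illustration, the function names are hypothetical, and Benjamini-Hochberg is included only as an assumed example of a "method controlling for multiple testing" (the abstract names Bonferroni but not the others).

```python
def bonferroni_select(p_values, alpha=0.05):
    """Indices of features whose p-value passes the Bonferroni threshold alpha / m."""
    m = len(p_values)
    threshold = alpha / m
    return [i for i, p in enumerate(p_values) if p <= threshold]


def benjamini_hochberg_select(p_values, alpha=0.05):
    """Indices selected by the Benjamini-Hochberg step-up procedure (FDR control)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    # Find the largest rank k such that p_(k) <= alpha * k / m.
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            largest_rank = rank
    return sorted(order[:largest_rank])


# Illustrative data only (assumption): two informative features with made-up
# p-values, followed by 10,000 null features whose p-values are spread evenly
# over (0, 1), as expected on average under the null hypothesis.
informative = [3e-6, 8e-6]
nulls = [(i + 0.5) / 10_000 for i in range(10_000)]
p_values = informative + nulls

bonferroni_hits = bonferroni_select(p_values)
bh_hits = benjamini_hochberg_select(p_values)
print("Bonferroni:", bonferroni_hits)
print("Benjamini-Hochberg:", bh_hits)
```

With 10,002 tests the Bonferroni threshold falls just below 5e-6, so a feature with a p-value of 8e-6 is rejected even though it stands far apart from the null features. This is the kind of conservativeness that can leave real biomarkers undiscovered when thousands of candidates are tested.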