SurvBoard: standardized benchmarking for multi-omics cancer survival models.

Impact factor 7.7 · Zone 2 (Biology) · JCR Q1 · Biochemical Research Methods

Briefings in bioinformatics Pub Date : 2025-08-31 DOI:10.1093/bib/bbaf521

David Wissel, Nikita Janakarajan, Aayush Grover, Enrico Toniato, Maria Rodríguez Martínez, Valentina Boeva

Briefings in Bioinformatics, volume 26, issue 5. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12486238/pdf/

Citations: 0

Abstract

Multi-omics data, which include genomic, transcriptomic, epigenetic, and proteomic data, are gaining increasing importance for determining the clinical outcomes of cancer patients. Several recent studies have evaluated various multimodal integration strategies for cancer survival prediction, highlighting the need for standardizing model performance results. Addressing this issue, we introduce SurvBoard, a benchmark framework that standardizes key experimental design choices. SurvBoard enables comparisons between single-cancer and pan-cancer data models and assesses the benefits of using patient data with missing modalities. We also address common pitfalls in preprocessing and validating multi-omics cancer survival models. We apply SurvBoard to several exemplary use cases, further confirming that statistical models tend to outperform deep learning methods, especially for metrics measuring survival function calibration. Moreover, most models exhibit better performance when trained in a pan-cancer context and can benefit from leveraging samples for which data of some omics modalities are missing. We provide a web service for evaluating models and for making our benchmark results easily accessible and viewable: https://www.survboard.science/. All code is available on GitHub: https://github.com/BoevaLab/survboard/. All benchmark outputs are available on Zenodo (DOI: 10.5281/zenodo.11066226). A video tutorial on how to use the SurvBoard leaderboard is available on YouTube at https://youtu.be/HJrdpJP8Vvk.
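The abstract does not name the specific metrics SurvBoard reports. As a rough illustration of what scoring a survival model on censored data involves, the sketch below computes Harrell's concordance index, a widely used discrimination metric. It is an assumption chosen for illustration, not necessarily a metric used by SurvBoard, and it is distinct from the calibration metrics the abstract mentions. The function name and argument names are hypothetical, and pairs with tied observed times are skipped as a simplification.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index: share of comparable patient pairs ranked correctly.

    times  -- observed follow-up time per patient
    events -- 1 if the event (death) was observed, 0 if censored
    risks  -- predicted risk score (higher = shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the earlier
        # time is an observed event (simplification: tied times are skipped).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # earlier event correctly given higher risk
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    times = [1, 2, 3, 4]
    events = [1, 1, 0, 1]          # third patient is censored
    risks = [0.9, 0.1, 0.5, 0.7]   # made-up model output
    print(concordance_index(times, events, risks))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. The toy inputs above are made up and are not benchmark data.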


Source journal

Briefings in Bioinformatics (Biology: Biochemical Research Methods)

CiteScore: 13.20
Self-citation rate: 13.70%
Articles published: 549
Review time: 6 months

Journal description: Briefings in Bioinformatics is an international journal serving as a platform for researchers and educators in the life sciences. It also appeals to mathematicians, statisticians, and computer scientists applying their expertise to biological challenges. The journal focuses on reviews tailored for users of databases and analytical tools in contemporary genetics, molecular and systems biology. It stands out by offering practical assistance and guidance to non-specialists in computerized methodologies. Covering a wide range from introductory concepts to specific protocols and analyses, the papers address bacterial, plant, fungal, animal, and human data. The journal's detailed subject areas include genetic studies of phenotypes and genotypes, mapping, DNA sequencing, expression profiling, gene expression studies, microarrays, alignment methods, protein profiles and HMMs, lipids, metabolic and signaling pathways, structure determination and function prediction, phylogenetic studies, and education and training.