An alternative to Cox’s regression for multiple survival curves comparison: A random forest-based approach using covariate structure

Lubomír Štěpánek, Filip Habarta, I. Malá, L. Marek
Published in: 2021 International Conference on Computing, Computational Modelling and Applications (ICCMA)
Publication date: 2021-07-01
DOI: https://doi.org/10.1109/ICCMA53594.2021.00029

Abstract

There are several established methods for comparing more than two survival curves, namely the scale-rank test and Cox’s proportional hazards model. However, when their statistical assumptions are not met, the validity of their results is affected. In this study, we address this issue and propose a new statistical approach for comparing more than two survival curves using a random forest algorithm, which is practically assumption-free. Repeatedly generating the many decision trees that make up one random forest model makes it possible to calculate the proportion of trees with sufficient complexity to classify into all groups (each depicted by its survival curve); this proportion serves as a p-value estimate, analogous to the classical Wald’s t-test output of Cox’s regression. Furthermore, the level of pruning of the decision trees the random forest model is built with can modify both the robustness and the statistical power of the random forest alternative. The discussed results are confirmed using COVID-19 survival data with varying tree pruning levels. The introduced method for comparing survival curves, based on the random forest algorithm, seems to be a valid alternative to Cox’s regression; moreover, it has no statistical assumptions and tends to reach higher statistical power.
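The core quantity described in the abstract, the proportion of trees that classify into all groups, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of (time, event) as the only covariates, Gini-based splitting, and a depth cap (`max_depth`) as a stand-in for the pruning level are all assumptions made for illustration. How exactly the proportion maps to a p-value is left to the paper; the sketch only computes the proportion.

```python
import random
from collections import Counter

# Each row is (time, event, group); the group label is the last element.
# ASSUMPTION: time and event status are the covariates; the paper may use others.

def gini(labels):
    """Gini impurity of a list of group labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows):
    """Return (feature_index, threshold) that lowers weighted Gini, or None."""
    best, best_score = None, gini([r[-1] for r in rows])
    for f in range(len(rows[0]) - 1):
        values = sorted({r[f] for r in rows})
        # Thresholds at every distinct value except the largest,
        # so both sides of the split are always non-empty.
        for thr in values[:-1]:
            left = [r[-1] for r in rows if r[f] <= thr]
            right = [r[-1] for r in rows if r[f] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score - 1e-12:
                best, best_score = (f, thr), score
    return best

def leaf_labels(rows, depth, max_depth):
    """Grow one tree recursively; return the set of majority labels at its leaves."""
    split = best_split(rows) if depth < max_depth else None
    if split is None:
        return {Counter(r[-1] for r in rows).most_common(1)[0][0]}
    f, thr = split
    left = [r for r in rows if r[f] <= thr]
    right = [r for r in rows if r[f] > thr]
    return leaf_labels(left, depth + 1, max_depth) | leaf_labels(right, depth + 1, max_depth)

def proportion_covering_all_groups(rows, n_trees=200, max_depth=2, seed=0):
    """Share of bootstrap-grown trees whose leaves predict every group.

    ASSUMPTION: max_depth plays the role of the pruning level.
    """
    rng = random.Random(seed)
    groups = {r[-1] for r in rows}
    hits = 0
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]  # bootstrap resample
        if leaf_labels(sample, 0, max_depth) == groups:
            hits += 1
    return hits / n_trees

if __name__ == "__main__":
    # Hypothetical toy data: three groups with clearly different survival times.
    toy = [(t, 1, "A") for t in range(1, 11)]
    toy += [(t, 1, "B") for t in range(21, 31)]
    toy += [(t, 1, "C") for t in range(41, 51)]
    for depth in (1, 2, 3):
        print(depth, proportion_covering_all_groups(toy, max_depth=depth))
```

The depth cap shows why the pruning level matters: a tree limited to one split has at most two leaves, so with three groups it can never classify into all of them, whereas deeper trees can. Varying `max_depth` therefore changes how readily the forest signals a difference between groups, which mirrors the abstract's remark that the pruning level affects robustness and power.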