Statistical learning methods for improving predictive performance in time-dependent survival models.

Hyungwoo Seo, Wonil Chung
Genomics & informatics, vol. 23, no. 1, p. 19. Published 2025-09-01. Journal Article.
DOI: https://doi.org/10.1186/s44342-025-00050-7
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12400734/pdf/
Citations: 0

Abstract

Background: The COVID-19 pandemic has highlighted the need for survival models to assess risk factors and time-dependent effects in infectious diseases. However, the Cox proportional hazards (PH) model, which assumes constant covariate effects, struggles to capture disease dynamics. This underscores the need for advanced models that incorporate time-dependent coefficients and covariates for improved accuracy.
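For orientation, the contrast drawn above can be written in standard textbook notation. The symbols below ($h_0$, $\beta$, $\tau_k$, $K$) are assumptions chosen for illustration and are not taken from the paper. The first line is the usual Cox PH model with constant coefficients, the second allows time-dependent coefficients and covariates, and the third is one common way to approximate $\beta(t)$ with a step function over $K$ time intervals.

```latex
\begin{align}
  h(t \mid x) &= h_0(t)\,\exp\!\left(\beta^{\top} x\right)
    && \text{(constant effects, PH assumption)} \\
  h(t \mid x(t)) &= h_0(t)\,\exp\!\left(\beta(t)^{\top} x(t)\right)
    && \text{(time-dependent coefficients and covariates)} \\
  \beta(t) &= \sum_{k=1}^{K} \beta_k \,\mathbf{1}\{\tau_{k-1} \le t < \tau_k\}
    && \text{(piecewise-constant approximation)}
\end{align}
```

Under the third form, the hazard ratio for a one-unit change in a covariate is $\exp(\beta_k)$ within interval $k$, so finer intervals let the estimated effect change more often over time.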

Methods: To address the need for modeling time-dependent effects and covariates, we applied a stratified Cox PH model with multiple time intervals to better satisfy the PH assumption. We conducted simulations to evaluate the performance of machine learning and deep learning survival models, including random survival forest (RSF), DeepSurv, and DeepHit. To improve time-dependent effect estimation, we introduced a refined time-interval division and a weighted sum approach for integrated hazard ratios of COVID-19 variants. The event of interest was death, and the specific risk compared was the risk of death from the start of the study to either death or the last follow-up among infected versus uninfected individuals.
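A model with multiple time intervals usually needs each subject's follow-up split at the interval boundaries first. The following is a minimal sketch of that splitting step only, assuming follow-up starts at time 0 and the cut points are given; the function name `split_follow_up` and the row layout are hypothetical, not the authors' implementation.

```python
from typing import List, Tuple

Row = Tuple[int, float, float, int]  # (interval index, start, stop, event flag)


def split_follow_up(time: float, event: int, cuts: List[float]) -> List[Row]:
    """Split one subject's follow-up (0, time] at the given cut points.

    The event flag is 1 only in the interval that contains the event time;
    every earlier interval is treated as censored at its right boundary.
    """
    rows: List[Row] = []
    start = 0.0
    # Add an open-ended final interval after the last cut point.
    for k, cut in enumerate(sorted(cuts) + [float("inf")]):
        if time <= start:
            break  # follow-up ended before this interval began
        stop = min(time, cut)
        event_here = int(event == 1 and time <= cut)
        rows.append((k, start, stop, event_here))
        start = cut
    return rows


# A subject who dies at day 250, with hypothetical cut points at days 100 and 200.
print(split_follow_up(250.0, 1, [100.0, 200.0]))
```

Each output row can then be assigned to its interval so that a separate coefficient is estimated per interval. How the paper chose its boundaries is not stated in the abstract.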

Results: Increasing the number of time intervals improved predictive accuracy. When the PH assumption held, the Cox PH model outperformed the machine learning and deep learning models. In the UK Biobank data, expanding the number of time intervals from five to fifteen improved performance. The previously reported hazard ratio of 7.333 for the pre-Delta period was refined to 29.359 for the Early variant, 20.734 for EU1, and 4.079 for Alpha, revealing a decline in risk across variants.
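The abstract does not state how the weighted sum behind the integrated hazard ratios is defined. As a rough sketch only, one plausible reading is a weighted average of interval-specific log hazard ratios. Both the log scale and the choice of weights (for example, interval length) are assumptions here, and the function name `integrated_hazard_ratio` is hypothetical.

```python
import math
from typing import Sequence


def integrated_hazard_ratio(hazard_ratios: Sequence[float],
                            weights: Sequence[float]) -> float:
    """Combine interval-specific hazard ratios into one summary value.

    Assumption: a weighted average on the log scale, with the weights
    normalised to sum to 1. The paper's actual weighting may differ.
    """
    if not hazard_ratios or len(hazard_ratios) != len(weights):
        raise ValueError("need one weight per hazard ratio")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    log_hr = sum(w * math.log(hr) for hr, w in zip(hazard_ratios, weights)) / total
    return math.exp(log_hr)


# Illustrative inputs only (not values from the paper): two equally weighted intervals.
print(integrated_hazard_ratio([2.0, 8.0], [1.0, 1.0]))
```

Averaging on the log scale keeps the summary between the smallest and largest interval-specific ratios and treats ratios above and below 1 symmetrically, which is why it is used in this sketch.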

Conclusions: These findings suggest that refining time intervals improves the understanding of time-dependent effects in infectious diseases. Incorporating stratified intervals and advanced models enhances risk assessment and predictive accuracy for COVID-19 and other evolving diseases.
