Scalable proximal methods for cause-specific hazard modeling with time-varying coefficients.

IF 16.4 | Zone 1 (Chemistry) | Q1, CHEMISTRY, MULTIDISCIPLINARY
Accounts of Chemical Research | Pub Date: 2022-04-01 | Epub Date: 2022-01-29 | DOI: 10.1007/s10985-021-09544-2
Wenbo Wu, Jeremy M G Taylor, Andrew F Brouwer, Lingfeng Luo, Jian Kang, Hui Jiang, Kevin He
Citation count: 0

Abstract

Survival modeling with time-varying coefficients has proven useful in analyzing time-to-event data with one or more distinct failure types. When studying the cause-specific etiology of breast and prostate cancers using the large-scale data from the Surveillance, Epidemiology, and End Results (SEER) Program, we encountered two major challenges that existing methods for estimating time-varying coefficients cannot tackle. First, these methods, dependent on expanding the original data in a repeated measurement format, result in formidable time and memory consumption as the sample size escalates to over one million. In this case, even a well-configured workstation cannot accommodate their implementations. Second, when the large-scale data under analysis include binary predictors with near-zero variance (e.g., only 0.6% of patients in our SEER prostate cancer data had tumors regional to the lymph nodes), existing methods suffer from numerical instability due to ill-conditioned second-order information. The estimation accuracy deteriorates further with multiple competing risks. To address these issues, we propose a proximal Newton algorithm with a shared-memory parallelization scheme and tests of significance and nonproportionality for the time-varying effects. A simulation study shows that our scalable approach reduces the time and memory costs by orders of magnitude and enjoys improved estimation accuracy compared with alternative approaches. Applications to the SEER cancer data demonstrate the real-world performance of the proximal Newton algorithm.
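
The abstract does not give notation, so the following is only a sketch of how a cause-specific hazard model with time-varying coefficients is commonly written. The symbols ($J$ failure types, covariates $X \in \mathbb{R}^P$, baseline hazards $\lambda_{0j}$, B-spline basis functions $B_k$ with coefficients $\theta_{jpk}$) are assumptions introduced for illustration and may differ from the authors' formulation.

```latex
% Cause-specific hazard for failure type j with time-varying effects (assumed notation)
\lambda_j(t \mid X) = \lambda_{0j}(t)\,\exp\{X^\top \beta_j(t)\},
\qquad j = 1, \dots, J.

% Each coefficient function expanded in a B-spline basis (assumed choice of basis)
\beta_{jp}(t) = \sum_{k=1}^{K} \theta_{jpk}\, B_k(t),
\qquad p = 1, \dots, P.

% Because B-splines sum to one, \beta_{jp}(t) is constant in t exactly when
% all of its spline coefficients are equal, so nonproportionality of the
% p-th effect can be tested through
H_0:\ \theta_{jp1} = \theta_{jp2} = \cdots = \theta_{jpK}.
```

Under this kind of parameterization, a binary covariate with near-zero variance contributes very little information about its $K$ spline coefficients, which is one way to see why the second-order (Hessian) information mentioned in the abstract can become ill-conditioned.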

Source journal
Accounts of Chemical Research (Chemistry, Multidisciplinary)
CiteScore: 31.40
Self-citation rate: 1.10%
Articles published: 312
Review time: 2 months
About the journal: Accounts of Chemical Research presents short, concise and critical articles offering easy-to-read overviews of basic research and applications in all areas of chemistry and biochemistry. These short reviews focus on research from the author’s own laboratory and are designed to teach the reader about a research project. In addition, Accounts of Chemical Research publishes commentaries that give an informed opinion on a current research problem. Special Issues online are devoted to a single topic of unusual activity and significance. Accounts of Chemical Research replaces the traditional article abstract with an article "Conspectus." These entries synopsize the research, affording the reader a closer look at the content and significance of an article. By providing a more detailed description of the article contents, the Conspectus enhances the article's discoverability by search engines and the exposure for the research.