Extrapolation before imputation reduces bias when imputing censored covariates.

IF 1.8 | Zone 2 (Mathematics) | Q2 STATISTICS & PROBABILITY
Sarah C Lotspeich, Tanya P Garcia
Journal: Journal of Computational and Graphical Statistics
Publication type: Journal Article
Publication date: 2025-02-05
DOI: https://doi.org/10.1080/10618600.2024.2444323
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12435536/pdf/

Abstract

Modeling symptom progression to identify ideal subjects for a Huntington's disease clinical trial is problematic since time to diagnosis, a key covariate, can be heavily censored. Imputation is an appealing strategy that replaces the censored covariate with its conditional mean, but existing methods saw over 200% bias under heavy censoring. Calculating conditional means well requires estimating and then integrating over the survival function of the censored covariate from the censored value to infinity. To estimate the survival function flexibly, existing methods use the semiparametric Cox model with Breslow's estimator, leaving the integrand for the conditional means (the survival function) undefined beyond the observed data. The integral is then estimated up to the largest observed covariate value, and this approximation can cut off the tail of the survival function and lead to severe bias. We combine the semiparametric survival estimator with a parametric extension to approximate the integral up to infinity. In simulations, our proposed extrapolation-before-imputation approach substantially reduces the bias seen with existing imputation methods, sometimes even when the parametric extension was misspecified. We further demonstrate how imputing with corrected conditional means can prioritize subjects for clinical trials. The R code to reproduce results is available in the Supplementary Material.
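The truncation problem described in the abstract can be illustrated with a minimal sketch. For a covariate X censored at c, the conditional mean is E[X | X > c] = c + (1 / S(c)) * (integral of S(x) from c to infinity), where S is the survival function of X. The sketch below is not the authors' implementation: it assumes no additional covariates and therefore swaps in a Kaplan-Meier estimator for the Cox model with Breslow's estimator, and it uses an exponential tail (rate = events / total follow-up time) purely as an illustrative parametric extension. All function names are hypothetical.

```python
import numpy as np


def kaplan_meier(times, events):
    """Distinct event times and the Kaplan-Meier survival estimate at each."""
    event_times = np.unique(times[events])
    surv = np.empty(len(event_times))
    s = 1.0
    for i, t in enumerate(event_times):
        at_risk = np.sum(times >= t)
        observed = np.sum((times == t) & events)
        s *= 1.0 - observed / at_risk
        surv[i] = s
    return event_times, surv


def surv_at(event_times, surv, x):
    """Evaluate the right-continuous step function S at x."""
    idx = np.searchsorted(event_times, x, side="right") - 1
    return 1.0 if idx < 0 else float(surv[idx])


def step_integral(event_times, surv, lower, upper):
    """Integrate the step function S from lower to upper."""
    if upper <= lower:
        return 0.0
    inside = event_times[(event_times > lower) & (event_times < upper)]
    knots = np.concatenate(([lower], inside, [upper]))
    # On each interval, S is constant at its value at the left endpoint.
    idx = np.searchsorted(event_times, knots[:-1], side="right") - 1
    heights = np.where(idx >= 0, surv[np.clip(idx, 0, None)], 1.0)
    return float(np.sum(heights * np.diff(knots)))


def conditional_mean(c, times, events, extrapolate=True):
    """Estimate E[X | X > c] with or without a parametric tail extension."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    event_times, surv = kaplan_meier(times, events)
    tau = times.max()  # largest observed value; S is undefined beyond it
    s_c = surv_at(event_times, surv, c)
    if s_c == 0.0:
        return float(c)
    # Truncated integral: stops at tau and drops the tail of S.
    area = step_integral(event_times, surv, c, tau)
    if extrapolate:
        # ASSUMPTION: exponential tail S(x) = S(tau) * exp(-rate * (x - tau)),
        # whose integral from tau to infinity is S(tau) / rate.
        rate = events.sum() / times.sum()
        area += surv_at(event_times, surv, tau) / rate
    return c + area / s_c


# Toy data: two observed values, two censored values (events == 0).
toy_times = [1.0, 2.0, 3.0, 4.0]
toy_events = [1, 1, 0, 0]
truncated = conditional_mean(3.0, toy_times, toy_events, extrapolate=False)
extended = conditional_mean(3.0, toy_times, toy_events, extrapolate=True)
```

On the toy data, the truncated estimate for the subject censored at 3 is about 4, while the extended estimate is about 9. The gap comes entirely from the tail area beyond the largest observed value, which the truncated integral discards.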

Source journal
CiteScore: 3.50
Self-citation rate: 8.30%
Articles published: 153
Review time: >12 weeks
About the journal: The Journal of Computational and Graphical Statistics (JCGS) presents the very latest techniques on improving and extending the use of computational and graphical methods in statistics and data analysis. Established in 1992, this journal contains cutting-edge research, data, surveys, and more on numerical graphical displays and methods, and perception. Articles are written for readers who have a strong background in statistics but are not necessarily experts in computing. Published in March, June, September, and December.