Double-slicing assisted sufficient dimension reduction for high-dimensional censored data
Authors: Shanshan Ding, W. Qian, Lan Wang
Journal: Annals of Statistics, vol. 48, pp. 2132-2154 (published 2020-08-01)
DOI: 10.1214/19-aos1880 (https://doi.org/10.1214/19-aos1880)
Impact factor: 3.2; JCR: Q1 (Statistics & Probability)
Citations: 5
Abstract
This paper provides a unified framework and an efficient algorithm for analyzing high-dimensional survival data under weak modeling assumptions. In particular, it imposes neither a parametric distributional assumption nor a linear regression assumption. It assumes only that the survival time T depends on a high-dimensional covariate vector X through low-dimensional linear combinations of covariates ΓX. The censoring time is allowed to be conditionally independent of the survival time given the covariates. This general framework includes many popular parametric and semiparametric survival regression models as special cases. The proposed algorithm produces a number of practically useful outputs with theoretical guarantees, including a consistent estimate of the sufficient dimension reduction subspace of T | X, a uniformly consistent Kaplan-Meier type estimator of the conditional distribution function of T, and a consistent estimator of the conditional quantile survival time. Our asymptotic results significantly extend the classical theory of sufficient dimension reduction for censored data (particularly that of Li et al. 1999) and the celebrated nonparametric Kaplan-Meier estimator to the setting where the number of covariates p diverges exponentially fast with the sample size n. We demonstrate the promising performance of the proposed new estimators through simulations and a real data example.
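For orientation, the classical (unconditional) Kaplan-Meier estimator that the abstract refers to can be sketched as follows. This is only a minimal illustration of the textbook estimator and of a quantile read off from it. It is not the paper's double-slicing algorithm or its conditional estimator, and the function names `kaplan_meier` and `km_quantile` are hypothetical, chosen here for illustration.

```python
def kaplan_meier(times, events):
    """Textbook Kaplan-Meier estimate of S(t) = P(T > t).

    times  : observed times Y_i = min(T_i, C_i)
    events : 1 if the survival time was observed, 0 if censored
    Returns (t, S(t)) pairs at each distinct time with an observed event.
    """
    pairs = sorted(zip(times, events))
    n_at_risk = len(pairs)
    surv = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same observed time.
        while i < len(pairs) and pairs[i][0] == t:
            deaths += pairs[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving past t.
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        # Both events and censored subjects leave the risk set.
        n_at_risk -= removed
    return curve


def km_quantile(curve, tau):
    """Smallest event time t with 1 - S(t) >= tau, or None if never reached."""
    for t, surv in curve:
        if 1.0 - surv >= tau:
            return t
    return None


# Toy data (made up for illustration): five subjects, two of them censored.
curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])
median = km_quantile(curve, 0.5)
```

In the paper's setting the estimator is conditional on the covariates through the reduced directions ΓX; the sketch above ignores covariates entirely and only shows the product-limit construction that the conditional version builds on.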
About the journal:
The Annals of Statistics aims to publish research papers of the highest quality reflecting the many facets of contemporary statistics. Primary emphasis is placed on importance and originality, not on formalism. The journal aims to cover all areas of statistics, especially mathematical statistics and applied and interdisciplinary statistics. Of course, many of the best papers will touch on more than one of these general areas, because the discipline of statistics has deep roots in mathematics and in substantive scientific fields.