Semiparametric Inference for a Two-Phase Failure-Time-Auxiliary-Dependent Sampling Design
Authors: Xu Cao, Qingning Zhou, Jianwen Cai, Haibo Zhou
Journal: Statistics in Medicine, volume 44, issue 18-19, article e70239 (published 2025-08-01)
DOI: 10.1002/sim.70239 (https://doi.org/10.1002/sim.70239)
Large cohort studies under simple random sampling can be prohibitively expensive for epidemiological studies with a limited budget, especially when exposure variables are costly or hard to obtain. Failure-time-dependent sampling (FDS) is a commonly used cost-effective sampling strategy for studies with failure times as outcomes. To improve on the efficiency of FDS, we propose a two-phase failure-time-auxiliary-dependent sampling (FADS) design in which the probability of obtaining the expensive exposures depends on both the failure time and cheaply available auxiliary variables for the main exposure of interest. To account for the sampling bias, we develop a semiparametric maximum pseudo-likelihood approach for inference and a nonparametric bootstrap procedure for variance estimation. The proposed estimator of the regression coefficients is shown to be consistent and asymptotically normal. Simulation studies indicate that the proposed method works well in practical settings and is more efficient than competing sampling schemes and methods. We illustrate the method by analyzing two real data sets, from the ARIC Study and the National Wilms' Tumor Study.
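To make the two-phase idea concrete, the following is a minimal sketch of a FADS-style selection step, not the authors' procedure. Everything in it is an assumption made for illustration: the simulated cohort, the function names (simulate_cohort, stratum, fads_sample), the cut points, and the stratum selection probabilities. The inverse-probability weights are included only to show that the selection probabilities are known by design; the semiparametric maximum pseudo-likelihood estimator and the bootstrap described in the abstract are not implemented here.

```python
import math
import random


def simulate_cohort(n, beta=0.5, seed=0):
    """Phase one: everyone has a (possibly censored) failure time and a cheap auxiliary w."""
    rng = random.Random(seed)
    cohort = []
    for i in range(n):
        x = rng.gauss(0.0, 1.0)                  # expensive exposure (measured only in phase two)
        w = x + rng.gauss(0.0, 1.0)              # cheap auxiliary, correlated with x (assumed form)
        t = rng.expovariate(math.exp(beta * x))  # exponential failure time with rate exp(beta * x)
        c = rng.uniform(0.0, 3.0)                # independent censoring time (assumed)
        cohort.append({"id": i, "x": x, "w": w, "time": min(t, c), "event": t <= c})
    return cohort


def stratum(subject, t_cut, w_cut):
    """Hypothetical strata: early observed failure (yes/no) crossed with high/low auxiliary."""
    early_failure = subject["event"] and subject["time"] <= t_cut
    return (early_failure, subject["w"] > w_cut)


def fads_sample(cohort, probs, t_cut, w_cut, seed=1):
    """Phase two: measure x with a probability that depends on failure time and auxiliary."""
    rng = random.Random(seed)
    phase_two = []
    for s in cohort:
        p = probs[stratum(s, t_cut, w_cut)]
        if rng.random() < p:
            # Record the inverse selection probability alongside the sampled subject.
            phase_two.append({**s, "weight": 1.0 / p})
    return phase_two


# Usage: oversample early failures with a high auxiliary value (illustrative probabilities).
cohort = simulate_cohort(2000)
probs = {(True, True): 0.9, (True, False): 0.6, (False, True): 0.3, (False, False): 0.1}
sample = fads_sample(cohort, probs, t_cut=0.5, w_cut=0.0)
print(len(sample), "of", len(cohort), "subjects selected for exposure measurement")
```

Setting every stratum probability to the same value reduces this to simple random sampling of the cohort, which is the baseline the abstract contrasts with.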
About the journal:
The journal aims to influence practice in medicine and its associated sciences through the publication of papers on statistical and other quantitative methods. Papers will explain new methods and demonstrate their application, preferably through a substantive, real, motivating example or a comprehensive evaluation based on an illustrative example. Alternatively, papers will report on case studies where creative use or technical generalizations of established methodology are directed towards a substantive application. Reviews of, and tutorials on, general topics relevant to the application of statistics to medicine will also be published. The main criteria for publication are appropriateness of the statistical methods to a particular medical problem and clarity of exposition. Papers with primarily mathematical content will be excluded. The journal aims to enhance communication between statisticians, clinicians and medical researchers.