{"title":"Adaptive Weight Selection for Time-To-Event Data Under Non-Proportional Hazards.","authors":"Moritz Fabian Danzer, Ina Dormuth","doi":"10.1002/sim.70045","DOIUrl":"10.1002/sim.70045","url":null,"abstract":"<p><p>When planning a clinical trial for a time-to-event endpoint, we require an estimated effect size and need to consider the type of effect. Usually, an effect of proportional hazards is assumed with the hazard ratio as the corresponding effect measure. Thus, the standard procedure for survival data is generally based on a single-stage log-rank test. Knowing that the assumption of proportional hazards is often violated and sufficient knowledge to derive reasonable effect sizes is usually unavailable, such an approach is relatively rigid. We introduce a more flexible procedure by combining two methods designed to be more robust in case we have little to no prior knowledge. First, we employ a more flexible adaptive multi-stage design instead of a single-stage design. Second, we apply combination-type tests in the first stage of our suggested procedure to benefit from their robustness under uncertainty about the deviation pattern. We can then use the data collected during this period to choose a more specific single-weighted log-rank test for the subsequent stages. In this step, we employ Royston-Parmar spline models to extrapolate the survival curves to make a reasonable decision. Based on a real-world data example, we show that our approach can save a trial that would otherwise end with an inconclusive result. 
Additionally, our simulation studies demonstrate a sufficient power performance while maintaining more flexibility.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"44 6","pages":"e70045"},"PeriodicalIF":1.8,"publicationDate":"2025-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11912538/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143650864","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Group Sequential Test for Two-Sample Ordinal Outcome Measures.","authors":"Yuan Wu, Ryan A Simmons, Baoshan Zhang, Jesse D Troy","doi":"10.1002/sim.70053","DOIUrl":"10.1002/sim.70053","url":null,"abstract":"<p><p>Group sequential trials include interim monitoring points to potentially reach futility or efficacy decisions early. This approach to trial design can safeguard patients, provide efficacious treatments for patients early, and save money and time. Group sequential methods are well developed for bell-shaped continuous, binary, and time-to-event outcomes. In this paper, we propose a group sequential design using the Mann-Whitney-Wilcoxon test for general two-sample ordinal data. We establish that the proposed test statistic has asymptotic normality and that sequential statistics satisfy the assumptions of Brownian motion. We also include results of finite sample simulation studies that show our proposed approach has the advantage over existing methods for controlling Type I errors while maintaining power for small sample sizes. A real data set is used to illustrate the proposed method and a sample size calculation approach is proposed for designing new studies.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"44 6","pages":"e70053"},"PeriodicalIF":1.8,"publicationDate":"2025-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11925493/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143650868","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sequential Monitoring of Covariate-Adaptive Randomized Clinical Trials With Non-Parametric Approaches.","authors":"Xiaotian Chen, Jun Yu, Hongjian Zhu, Li Wang","doi":"10.1002/sim.70042","DOIUrl":"https://doi.org/10.1002/sim.70042","url":null,"abstract":"<p><p>The importance of covariate adjustment in clinical trials has been underscored by the U.S. FDA's guidance. Inference, with or without covariates, after implementing covariate adaptive randomization (CAR), is garnering increased interest. This paper investigates the sequential monitoring of covariate-adaptive randomized clinical trials through non-parametric methods, a critical advancement for enhancing the precision and efficiency of medical research. CAR, which incorporates baseline patient characteristics into the randomization process, aims to mitigate the risk of confounding and improve the balance of covariates across treatment groups, thereby addressing patients' heterogeneity. Although CAR is known for its benefits in reducing biases and enhancing statistical power, its integration into sequentially monitored clinical trials-a standard practice-poses methodological challenges, particularly in controlling the type I error rate. By employing a non-parametric approach, we demonstrate through theoretical proofs and numerical analyses that our methods effectively control the type I error rate and surpass traditional randomization and analysis methods. 
This paper not only fills a gap in the literature on sequential monitoring of CAR without model misspecification but also proposes practical solutions for enhancing trial design and analysis, thereby contributing significantly to the field of clinical research.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"44 6","pages":"e70042"},"PeriodicalIF":1.8,"publicationDate":"2025-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143664450","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ensemble of Sequential Learning Models With Distributed Data Centers and Its Applications.","authors":"Zhanfeng Wang, Jingyu Huang, Yuan-Chin Ivan Chang","doi":"10.1002/sim.70002","DOIUrl":"https://doi.org/10.1002/sim.70002","url":null,"abstract":"<p><p>Handling massive datasets poses a significant challenge in modern data analysis, particularly within epidemiology and medicine. In this study, we introduce a novel approach using sequential ensemble learning to effectively analyze extensive datasets. Our method prioritizes efficiency from both statistical and computational perspectives, addressing challenges such as data communication and privacy, as discussed in federated learning literature. To demonstrate the efficacy of our approach, we present compelling real-world examples using COVID-19 data alongside simulation studies.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"44 6","pages":"e70002"},"PeriodicalIF":1.8,"publicationDate":"2025-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143650866","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identification and Estimation of Causal Effects Using Non-Concurrent Controls in Platform Trials.","authors":"Michele Santacatterina, Federico Macchiavelli Giron, Xinyi Zhang, Iván Díaz","doi":"10.1002/sim.70017","DOIUrl":"https://doi.org/10.1002/sim.70017","url":null,"abstract":"<p><p>Platform trials are multi-arm designs that simultaneously evaluate multiple treatments for a single disease within the same overall trial structure. Unlike traditional randomized controlled trials, they allow treatment arms to enter and exit the trial at distinct times while maintaining a control arm throughout. This control arm comprises both concurrent controls, where participants are randomized concurrently to either the treatment or control arm, and non-concurrent controls, who enter the trial when the treatment arm under study is unavailable. While flexible, platform trials introduce the challenge of using non-concurrent controls, raising questions about estimating treatment effects. Specifically, which estimands should be targeted? Under what assumptions can these estimands be identified and estimated? Are there any efficiency gains? In this article, we discuss issues related to the identification and estimation assumptions of common choices of estimand. We conclude that the most robust strategy to increase efficiency without imposing unwarranted assumptions is to target the concurrent average treatment effect (cATE), the ATE among only concurrent units, using a covariate-adjusted doubly robust estimator. Our studies suggest that, for the purpose of obtaining efficiency gains, collecting important prognostic variables is more important than relying on non-concurrent controls. We also discuss the perils of targeting ATE due to an untestable extrapolation assumption that will often be invalid. 
We provide simulations illustrating our points and an application to the ACTT platform trial, resulting in a 20% improvement in precision compared to the naive estimator that ignores non-concurrent controls and prognostic variables.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"44 6","pages":"e70017"},"PeriodicalIF":1.8,"publicationDate":"2025-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143650870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimation and Hypothesis Testing of Strain-Specific Vaccine Efficacy With Missing Strain Types With Application to a COVID-19 Vaccine Trial.","authors":"Fei Heng, Yanqing Sun, Li Li, Peter B Gilbert","doi":"10.1002/sim.10345","DOIUrl":"10.1002/sim.10345","url":null,"abstract":"<p><p>Based on data from a randomized, controlled vaccine efficacy trial, this article develops statistical methods for assessing vaccine efficacy (VE) to prevent COVID-19 infections by a discrete set of genetic strains of SARS-CoV-2. Strain-specific VE adjusting for possibly time-varying covariates is estimated using augmented inverse probability weighting to address missing viral genotypes under a competing risks model that allows separate baseline hazards for different risk groups. Hypothesis tests are developed to assess whether the vaccine provides at least a specified level of VE against some viral genotypes and whether VE varies across genotypes. Asymptotic properties providing analytic inferences are derived and finite-sample properties of the estimators and hypothesis tests are studied through simulations. This research is motivated by the fact that previous analyses of COVID-19 vaccine efficacy did not account for missing genotypes, which can cause severe bias and efficiency loss. The theoretical properties and simulations demonstrate superior performance of the new methods. Application to the Moderna COVE trial identifies several SARS-CoV-2 genotype features with differential vaccine efficacy across genotypes, including lineage (Reference, Epsilon, Gamma, Zeta), indicators of residue match vs. mismatch to the vaccine-strain residue at Spike amino acid positions (identifying signatures of differential VE), and a weighted Hamming distance to the vaccine strain. 
The results show VE decreases against genotypes more distant from the vaccine strain, highlighting the need to update COVID-19 vaccine strains.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"44 6","pages":"e10345"},"PeriodicalIF":1.8,"publicationDate":"2025-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11906172/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143606528","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Incorporating Additional Evidence as Prior Information to Resolve Non-Identifiability in Bayesian Disease Model Calibration: A Tutorial.","authors":"Daria Semochkina, Cathal D Walsh","doi":"10.1002/sim.70039","DOIUrl":"10.1002/sim.70039","url":null,"abstract":"<p><p>Disease models are used to examine the likely impact of therapies, interventions, and public policy changes. Ensuring that these are well calibrated on the basis of available data and that the uncertainty in their projections is properly quantified is an important part of the process. The question of non-identifiability poses a challenge to disease model calibration where multiple parameter sets generate identical model outputs. For statisticians evaluating the impact of policy interventions such as screening or vaccination, this is a critical issue. This study explores the use of the Bayesian framework to provide a natural way to calibrate models and address non-identifiability in a probabilistic fashion in the context of disease modeling. We present Bayesian approaches for incorporating expert knowledge and external data to ensure that appropriately informative priors are specified on the joint parameter space. These approaches are applied to two common disease models: a basic susceptible-infected-susceptible (SIS) model and a much more complex agent-based model which has previously been used to address public policy questions in HPV and cervical cancer. The conditions that allow the problem of non-identifiability to be resolved are demonstrated for the SIS model. For the larger HPV model, an overview of the findings is presented, but of key importance is a discussion on how the non-identifiability impacts the calibration process. Through case studies, we demonstrate how informative priors can help resolve non-identifiability and improve model inference. We also discuss how sensitivity analysis can be used to assess the impact of prior specifications on model results. 
Overall, this work provides an important tutorial for researchers interested in applying Bayesian methods to calibrate models and handle non-identifiability in disease models.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"44 6","pages":"e70039"},"PeriodicalIF":1.8,"publicationDate":"2025-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11915782/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143658712","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Variable Selection for Progressive Multistate Processes Under Intermittent Observation.","authors":"Xianwei Li, Richard J Cook, Liqun Diao","doi":"10.1002/sim.70023","DOIUrl":"10.1002/sim.70023","url":null,"abstract":"<p><p>Multistate models offer a natural framework for studying many chronic disease processes. Interest often lies in identifying which among a large list of candidate variables play a role in the progression of such processes. We consider the problem of variable selection for progressive multistate processes under intermittent observation based on penalized log-likelihood. An Expectation-Maximization (EM) algorithm is developed such that the maximization step can exploit existing software for penalized Poisson regression thereby allowing for the use of common penalty functions. Simulation studies show good performance in identifying important markers with different penalty functions. In a motivating application involving a cohort of patients with psoriatic arthritis, we identify which, among a large group of candidate HLA markers, are associated with rapid disease progression.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"44 6","pages":"e70023"},"PeriodicalIF":1.8,"publicationDate":"2025-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11924175/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143664377","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian Calibration of Stochastic Agent Based Model via Random Forest.","authors":"Connor Robertson, Cosmin Safta, Nicholson Collier, Jonathan Ozik, Jaideep Ray","doi":"10.1002/sim.70029","DOIUrl":"https://doi.org/10.1002/sim.70029","url":null,"abstract":"<p><p>Agent-based models (ABM) provide an excellent framework for modeling outbreaks and interventions in epidemiology by explicitly accounting for diverse individual interactions and environments. However, these models are usually stochastic and highly parametrized, requiring precise calibration for predictive performance. When considering realistic numbers of agents and properly accounting for stochasticity, this high-dimensional calibration can be computationally prohibitive. This paper presents a random forest-based surrogate modeling technique to accelerate the evaluation of ABMs and demonstrates its use to calibrate an epidemiological ABM named CityCOVID via Markov chain Monte Carlo (MCMC). The technique is first outlined in the context of CityCOVID's quantities of interest, namely hospitalizations and deaths, by exploring dimensionality reduction via temporal decomposition with principal component analysis (PCA) and via sensitivity analysis. The calibration problem is then presented, and samples are generated to best match COVID-19 hospitalization and death numbers in Chicago from March to June in 2020. 
These results are compared with previous approximate Bayesian calibration (IMABC) results, and their predictive performance is analyzed, showing improved performance with a reduction in computation.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"44 6","pages":"e70029"},"PeriodicalIF":1.8,"publicationDate":"2025-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143626181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Guidelines and Best Practices for the Use of Targeted Maximum Likelihood and Machine Learning When Estimating Causal Effects of Exposures on Time-To-Event Outcomes.","authors":"Denis Talbot, Awa Diop, Miceline Mésidor, Yohann Chiu, Caroline Sirois, Andrew J Spieker, Antoine Pariente, Pernelle Noize, Marc Simard, Miguel Angel Luque Fernandez, Michael Schomaker, Kenji Fujita, Danijela Gnjidic, Mireille E Schnitzer","doi":"10.1002/sim.70034","DOIUrl":"10.1002/sim.70034","url":null,"abstract":"<p><p>Targeted maximum likelihood estimation (TMLE) is an increasingly popular framework for the estimation of causal effects. It requires modeling both the exposure and outcome but is doubly robust in the sense that it is valid if at least one of these models is correctly specified. In addition, TMLE allows for flexible modeling of both the exposure and outcome with machine learning methods. This provides better control for measured confounders since the model specification automatically adapts to the data, instead of needing to be specified by the analyst a priori. Despite these methodological advantages, TMLE remains less popular than alternatives in part because of its less accessible theory and implementation. While some tutorials have been proposed, none address the case of a time-to-event outcome. This tutorial provides a detailed step-by-step explanation of the implementation of TMLE for estimating the effect of a point binary or multilevel exposure on a time-to-event outcome, modeled as counterfactual survival curves and causal hazard ratios. The tutorial also provides guidelines on how best to use TMLE in practice, including aspects related to study design, choice of covariates, controlling biases and use of machine learning. R code is provided to illustrate each step using simulated data (https://github.com/detal9/SurvTMLE). To facilitate implementation, a general R function implementing TMLE with options to use machine learning is also provided. 
The method is illustrated in a real-data analysis concerning the effectiveness of statins for the prevention of a first cardiovascular disease among older adults in Québec, Canada, between 2013 and 2018.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"44 6","pages":"e70034"},"PeriodicalIF":1.8,"publicationDate":"2025-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11905698/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143626183","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}