{"title":"Using shrinkage methods to estimate treatment effects in overlapping subgroups in randomized clinical trials with a time-to-event endpoint.","authors":"Marcel Wolbers, Mar Vázquez Rabuñal, Ke Li, Kaspar Rufibach, Daniel Sabanés Bové","doi":"10.1177/09622802241313292","DOIUrl":"https://doi.org/10.1177/09622802241313292","url":null,"abstract":"<p><p>In randomized controlled trials, forest plots are frequently used to investigate the homogeneity of treatment effect estimates in pre-defined subgroups. However, the interpretation of subgroup-specific treatment effect estimates requires great care due to the smaller sample size of subgroups and the large number of investigated subgroups. Bayesian shrinkage methods have been proposed to address these issues, but they often focus on disjoint subgroups while subgroups displayed in forest plots are overlapping, i.e., each subject appears in multiple subgroups. In our proposed approach, we first build a flexible Cox model based on all available observations, including treatment-by-subgroup interaction terms for all subgroups of interest. We explore penalized partial likelihood estimation with lasso or ridge penalties for interaction terms, and Bayesian estimation with a regularized horseshoe prior. In a second step, the Cox model is marginalized to obtain treatment effect estimates for all subgroups. We illustrate these methods using data from a randomized clinical trial in follicular lymphoma and evaluate their properties in a simulation study. In all simulation scenarios, the overall mean-squared error is substantially smaller for penalized and shrinkage estimators compared to the standard subgroup-specific treatment effect estimator but leads to some bias for heterogeneous subgroups. A naive overall sample estimator also outperforms the standard subgroup-specific estimator in terms of the overall mean-squared error for all scenarios except for a scenario with substantial heterogeneity. 
We recommend that subgroup-specific estimators are routinely complemented by treatment effect estimators based on shrinkage methods. The proposed methods are implemented in the R package bonsaiforest.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"9622802241313292"},"PeriodicalIF":1.6,"publicationDate":"2025-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143701487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Causal mediation analysis for time-to-event mediator and outcome in the presence of left truncation.","authors":"Jih-Chang Yu, Yen-Tsung Huang","doi":"10.1177/09622802241313291","DOIUrl":"https://doi.org/10.1177/09622802241313291","url":null,"abstract":"<p><p>We propose a causal mediation approach to semi-competing risks under left truncation sampling by considering an intermediate event as a mediator and a terminal event as an outcome. We focus on the causal relationship from exposure to the terminal outcome in relation to the intermediate event. In particular, we study the direct effect, the effect of exposure on the terminal event that is not through the intermediate event, and the indirect effect-the effect of exposure on the terminal event that is mediated through the intermediate event. We propose nonparametric and semiparametric methods, both accounting for left truncation. The nonparametric estimator can be viewed as a model-free time-varying Nelson-Aalen estimator that is robust to model misspecification. The semiparametric estimator calculated with the Cox proportional hazards model enjoys flexibility in adjusting for potential confounders as covariates. The asymptotic properties for both estimators, including uniform consistency and weak convergence, were established using the martingale theorem and functional delta method. The finite sample performance of the proposed estimators was evaluated through extensive numerical studies that investigated the influences of left truncation, confounding, and sample size. 
The utility of the proposed methods was illustrated using a hepatitis study.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"9622802241313291"},"PeriodicalIF":1.6,"publicationDate":"2025-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143693419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Boosting distributional copula regression for bivariate binary, discrete and mixed responses.","authors":"Guillermo Briseño Sanchez, Nadja Klein, Hannah Klinkhammer, Andreas Mayr","doi":"10.1177/09622802241313294","DOIUrl":"https://doi.org/10.1177/09622802241313294","url":null,"abstract":"<p><p>Motivated by challenges in the analysis of biomedical data and observational studies, we develop statistical boosting for the general class of bivariate distributional copula regression with arbitrary marginal distributions, which is suited for binary, count, continuous or mixed outcomes. To arrive at a flexible model for the entire conditional distribution, not only the marginal distribution parameters but also the copula parameters are related to covariates through additive predictors. We suggest estimation by means of an adapted component-wise gradient boosting algorithm. A key benefit of boosting as opposed to classical likelihood or Bayesian estimation is the implicit data-driven variable selection mechanism as well as shrinkage. To the best of our knowledge, our implementation is the only one that combines a wide range of covariate effects, marginal distributions, copula functions, and implicit data-driven variable selection. We showcase the versatility of our approach to data from genetic epidemiology, healthcare utilization and childhood undernutrition. 
Our developments are implemented in the R package gamboostLSS, fostering transparent and reproducible research.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"9622802241313294"},"PeriodicalIF":1.6,"publicationDate":"2025-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143674451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Simultaneous variable selection and estimation for a partially linear Cox model.","authors":"Tingting Cai, Mengqi Xie, Tao Hu, Jianguo Sun","doi":"10.1177/09622802251322988","DOIUrl":"https://doi.org/10.1177/09622802251322988","url":null,"abstract":"<p><p>We consider simultaneous variable selection and estimation for a deep neural network-based partially linear Cox model and propose a novel penalized approach. In particular, a two-step iterative algorithm is developed with the use of the minimum information criterion to ensure sparse estimation. The proposed method circumvents the curse of dimensionality while facilitating the interpretability of linear covariate effects on survival, and the algorithm greatly reduces the computational burden by avoiding the need to select the optimal tuning parameters that is usually required by many other popular penalties. The convergence rate and asymptotic properties of the resulting estimator are established along with the consistency of variable selection. The performance of the procedure is demonstrated through extensive simulation studies and an application to a myeloma dataset.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"9622802251322988"},"PeriodicalIF":1.6,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143670989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Effect estimation in the presence of a misclassified binary mediator.","authors":"Kimberly A Hochstedler Webb, Martin T Wells","doi":"10.1177/09622802251316970","DOIUrl":"https://doi.org/10.1177/09622802251316970","url":null,"abstract":"<p><p>Mediation analyses allow researchers to quantify the effect of an exposure variable on an outcome variable through a mediator variable. If a binary mediator variable is misclassified, the resulting analysis can be severely biased. Misclassification is especially difficult to deal with when it is differential and when there are no gold standard labels available. Previous work has addressed this problem using a sensitivity analysis framework or by assuming that misclassification rates are known. We leverage a variable related to the misclassification mechanism to recover unbiased parameter estimates without using gold standard labels. The proposed methods require the reasonable assumption that the sum of the sensitivity and specificity is greater than 1. Three correction methods are presented: (1) An ordinary least squares correction for Normal outcome models, (2) a multi-step predictive value weighting method, and (3) a seamless expectation-maximization algorithm. 
We apply our misclassification correction strategies to investigate the mediating role of gestational hypertension on the association between maternal age and pre-term birth.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"9622802251316970"},"PeriodicalIF":1.6,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143670956","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Interval estimation for the Youden index of a continuous diagnostic test with verification biased data.","authors":"Shirui Wang, Shuangfei Shi, Gengsheng Qin","doi":"10.1177/09622802251322989","DOIUrl":"https://doi.org/10.1177/09622802251322989","url":null,"abstract":"<p><p>In medical diagnostic studies, the Youden index plays a crucial role as a comprehensive measurement of the diagnostic test effectiveness, aiding in determining the optimal threshold values by maximizing the sum of sensitivity and specificity. However, in clinical practice, verification of true disease status might be partially missing and estimators based on partially validated subjects are usually biased. While verification bias-corrected estimation methods for the receiver operating characteristic curve have been widely studied, no such results have been specifically developed for the Youden index. In this paper, we propose bias-corrected interval estimation methods for the Youden index of a continuous test under the missing-at-random assumption. Based on four estimators (full imputation (FI), mean score imputation, inverse probability weighting, and the semiparametric efficient (SPE)) introduced by Alonzo and Pepe for handling verification bias, we develop multiple confidence intervals for the Youden index by applying bootstrap resampling and the method of variance estimates recovery (MOVER). Extensive simulation and real data studies show that when the disease model is correctly specified, MOVER-FI intervals yield better coverage probability. We also observe a tradeoff between methods when the verification proportion is low: Bootstrap approaches achieve higher accuracy, while MOVER approaches deliver greater precision. Remarkably, bootstrap-SPE interval exhibit appealing doubly robustness to model misspecification and perform adequately across almost all scenarios considered. 
Based on our findings, we recommend using the bootstrap-SPE intervals when the true disease model is unknown, and the MOVERws-FI interval if the true disease model can be well approximated.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"9622802251322989"},"PeriodicalIF":1.6,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143670981","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sample size determinations in four-level longitudinal cluster randomized trials with random slope.","authors":"Priyanka Majumder, Siuli Mukhopadhyay, Bo Wang, Samiran Ghosh","doi":"10.1177/09622802251321996","DOIUrl":"https://doi.org/10.1177/09622802251321996","url":null,"abstract":"<p><p>Cluster or group randomized trials (CRTs) are increasingly used for behavioral as well as system-level interventions in many areas e.g. medicine, psychotherapy, policy, and health service research etc. Sample size determination for each level at the design stage is always a key requirement for any intervention trial including CRT. This work addresses this important issue for a four-level longitudinal CRT via detecting the intervention effect over time. A random intercept and random slope mixed effects linear regression model, including a time-by-intervention interaction is used for modeling. Closed-form expression of the power function and sample size for each level are determined to detect the interaction effect. Other than statistical power consideration, several other factors need attention while designing such CRTs. Optimal allocations accounting for subject attrition and cost constraints have been determined here. How sample size determination based on fixed and random slope models affects when between-subject variations in outcome are anticipated to be significant is also studied. The effect of ignoring cluster levels in a four-level CRT, which is often the case in the absence of an appropriate four-level model, is studied in details. 
Lastly, the proposed model is illustrated via a real-life human immunodeficiency virus prevention study conducted in the Bahamas.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"9622802251321996"},"PeriodicalIF":1.6,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143670986","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A connection between covariate adjustment and stratified randomization in randomized clinical trials.","authors":"Zhiwei Zhang","doi":"10.1177/09622802251324764","DOIUrl":"https://doi.org/10.1177/09622802251324764","url":null,"abstract":"<p><p>The statistical efficiency of randomized clinical trials can be improved by incorporating information from baseline covariates (i.e. pre-treatment patient characteristics). This can be done in the design stage using stratified (permutated block) randomization or in the analysis stage through covariate adjustment. This article makes a connection between covariate adjustment and stratified randomization in a general framework where all regular, asymptotically linear estimators are identified as augmented estimators. From a geometric perspective, covariate adjustment can be viewed as an attempt to approximate the optimal augmentation function, and stratified randomization improves a given approximation by moving it closer to the optimal augmentation function. The efficiency benefit of stratified randomization is asymptotically equivalent to attaching an optimal augmentation term based on the stratification factor. In designing a trial with stratified randomization, it is not essential to include all important covariates in the stratification, because their prognostic information can be incorporated through covariate adjustment. Under stratified randomization, adjusting for the stratification factor only in data analysis is not expected to improve efficiency, and the key to efficient estimation is incorporating prognostic information from all important covariates. 
These observations are confirmed in a simulation study and illustrated using real clinical trial data.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"9622802251324764"},"PeriodicalIF":1.6,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143670944","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Additive hazard causal model with a binary instrumental variable.","authors":"Zhisong Zhao, Huijuan Ma, Yong Zhou","doi":"10.1177/09622802251314288","DOIUrl":"https://doi.org/10.1177/09622802251314288","url":null,"abstract":"<p><p>The causal effect of a treatment on a censored outcome is often of fundamental interest and instrumental variable (IV) is a useful tool to tame bias caused by unmeasured confounding. The two-stage least squares commonly used for IV analysis in linear regression have been developed for regression analysis in a survival context under an additive hazards model. In this work, we study a distinctive binary IV framework with censored data where the causal treatment effect is quantified through an additive hazard model for compliers. Employing the special characteristics of the binary IV and adapting the principle of conditional score, we establish a weighted estimator with explicit form. We establish the asymptotic properties of the proposed estimators and provide plug-in and perturbed variance estimators. The finite sample performance of the proposed estimator is examined by extensive simulations. The proposed method is applied to a data set from the U.S. renal data system to compare dialytic modality-specific survival for end-stage renal disease patients.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"9622802251314288"},"PeriodicalIF":1.6,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143670949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Semi-parametric testing for ordinal treatment effects in time-to-event data via dynamic Dirichlet process mixtures of the inverse-Gaussian distribution.","authors":"Jonathan A Race, Amy S Ruppert, Yvonne Efebera, Michael L Pennell","doi":"10.1177/09622802251322986","DOIUrl":"https://doi.org/10.1177/09622802251322986","url":null,"abstract":"<p><p>Time-to-event data often violate the proportional hazards assumption under which the log-rank test is optimal. Such violations are especially common in the sphere of biological and medical data where heterogeneity due to unmeasured covariates or time varying effects are common. A variety of parametric survival models have been proposed in the literature which make more appropriate assumptions on the hazard function, at least for certain applications. One such model is derived from the first hitting time paradigm which assumes that a subject's event time is determined by a latent stochastic process reaching a threshold value. Several random effects specifications of the first hitting time model have also been proposed which allow for better modeling of data with unmeasured covariates. We propose a Bayesian model which loosens assumptions on the mixing distribution inherent in the random effects first hitting time models currently in use and we do so in a manner which is ideally suited for testing for effects of ordinal treatment variables. We demonstrate via a simulation study that the proposed model has better power than log-rank based methods in detecting ordinal treatment effects in the presence of nonproportional hazards. Additionally, we show that the proposed model is almost as powerful as log-rank based methods when the proportional hazards assumption holds. 
We also apply the proposed methodology to two biomedical data sets: a toxicity study in rodents and an observational study of cancer patients.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"9622802251322986"},"PeriodicalIF":1.6,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143670987","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}