Relationship between collider bias and interactions on the log-additive scale.
Apostolos Gkatzionis, Shaun R Seaman, Rachael A Hughes, Kate Tilling
Statistical Methods in Medical Research, published online 2025-03-02. DOI: 10.1177/09622802241306860

Collider bias occurs when conditioning on a common effect (collider) of two variables X and Y. In this article, we quantify the collider bias in the estimated association between exposure X and outcome Y induced by selecting on one value of a binary collider S of the exposure and the outcome. In the case of logistic regression, it is known that the magnitude of the collider bias in the exposure-outcome regression coefficient is proportional to the strength of the interaction δ3 between X and Y in a log-additive model for the collider: P(S = 1 | X, Y) = exp{δ0 + δ1 X + δ2 Y + δ3 XY}. We show that this result also holds under a linear or Poisson regression model for the exposure-outcome association. We then illustrate numerically that even if a log-additive model with interactions is not the true model for the collider, the interaction term in such a model is still informative about the magnitude of collider bias. Finally, we discuss the implications of these findings for methods that attempt to adjust for collider bias, such as inverse probability weighting, which is often implemented without including interactions between variables in the weighting model.
A new cure model accounting for longitudinal data and flexible patterns of hazard ratios over time.
Can Xie, Xuelin Huang, Ruosha Li, Yu Shen, Nicholas J Short, Kapil N Bhalla
Statistical Methods in Medical Research, published online 2025-02-28. DOI: 10.1177/09622802251320793

With the advancement of medical treatments, many historically incurable diseases have become curable, and accurate estimation of cure rates is of great interest. When there is no clear biomarker indicating cure, estimation of the cure rate is intertwined with, and influenced by, the specification of the hazard functions for uncured patients. Consequently, the commonly used proportional hazards (PH) assumption, when violated, may lead to biased cure rate estimation. Meanwhile, longitudinal biomarker measurements for individual patients are usually available. To accommodate non-PH functions and incorporate individual longitudinal biomarker trajectories, we propose a new joint model for cure, survival, and longitudinal data, in which hazard ratios between covariate subgroups vary flexibly over time. The joint model shares individual random effects between its longitudinal and cure-survival submodels. The regression parameters are estimated by maximizing the non-parametric likelihood via a Monte Carlo expectation-maximization algorithm, with standard errors estimated by jackknife resampling. In simulation studies covering both crossing and non-crossing survival curves, the proposed model provides unbiased estimates of the cure rates. The proposed joint cure model is illustrated via a study of chronic myeloid leukemia.
Outcome adaptive propensity score methods for handling censoring and high-dimensionality: Application to insurance claims.
Jiacong Du, Youfei Yu, Min Zhang, Zhenke Wu, Andrew M Ryan, Bhramar Mukherjee
Statistical Methods in Medical Research, published online 2025-02-27. DOI: 10.1177/09622802241306856

Propensity scores are commonly used to reduce confounding bias in non-randomized observational studies when estimating the average treatment effect. An important assumption underlying this approach is that all confounders associated with both the treatment and the outcome of interest are measured and included in the propensity score model. In the absence of strong prior knowledge about potential confounders, researchers may agnostically want to adjust for a high-dimensional set of pre-treatment variables, so a variable selection procedure is needed for propensity score estimation. In addition, studies show that including variables related only to the treatment in the propensity score model may inflate the variance of the treatment effect estimators, while including variables that are predictive of only the outcome can improve efficiency. In this article, we propose to incorporate the outcome-covariate relationship into the propensity score model by including the predicted binary outcome probability as a covariate. Our approach can be easily adapted to an ensemble of variable selection methods, including regularization methods and modern machine-learning tools based on classification and regression trees. We evaluate our method for estimating the treatment effects on a binary outcome, possibly censored, across multiple treatment groups. Simulation studies indicate that incorporating the outcome probability when estimating the propensity scores can improve statistical efficiency and protect against model misspecification. The proposed methods are applied to a cohort of advanced-stage prostate cancer patients identified from a private insurance claims database, comparing the adverse effects of four commonly used drugs for treating castration-resistant prostate cancer.
{"title":"Generalized framework for identifying meaningful heterogenous treatment effects in observational studies: A parametric data-adaptive G-computation approach.","authors":"Roch A Nianogo, Stephen O'Neill, Kosuke Inoue","doi":"10.1177/09622802251316969","DOIUrl":"https://doi.org/10.1177/09622802251316969","url":null,"abstract":"<p><p>There has been a renewed interest in identifying heterogenous treatment effects (HTEs) to guide personalized medicine. The objective was to illustrate the use of a step-by-step transparent parametric data-adaptive approach (the generalized HTE approach) based on the G-computation algorithm to detect heterogenous subgroups and estimate meaningful conditional average treatment effects (CATE). The following seven steps implement the generalized HTE approach: Step 1: Select variables that satisfy the backdoor criterion and potential effect modifiers; Step 2: Specify a flexible saturated model including potential confounders and effect modifiers; Step 3: Apply a selection method to reduce overfitting; Step 4: Predict potential outcomes under treatment and no treatment; Step 5: Contrast the potential outcomes for each individual; Step 6: Fit cluster modeling to identify potential effect modifiers; Step 7: Estimate subgroup CATEs. We illustrated the use of this approach using simulated and real data. Our generalized HTE approach successfully identified HTEs and subgroups defined by all effect modifiers using simulated and real data. Our study illustrates that it is feasible to use a step-by-step parametric and transparent data-adaptive approach to detect effect modifiers and identify meaningful HTEs in an observational setting. This approach should be more appealing to epidemiologists interested in explanation.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"9622802251316969"},"PeriodicalIF":1.6,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143493492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extension of Fisher's least significant difference method to multi-armed group-sequential response-adaptive designs.","authors":"Wenyu Liu, D Stephen Coad","doi":"10.1177/09622802251319896","DOIUrl":"https://doi.org/10.1177/09622802251319896","url":null,"abstract":"<p><p>Multi-armed multi-stage designs evaluate experimental treatments using a control arm at interim analyses. Incorporating response-adaptive randomisation in these designs allows early stopping, faster treatment selection and more patients to be assigned to the more promising treatments. Existing frequentist multi-armed multi-stage designs demonstrate that the family-wise error rate is strongly controlled, but they may be too conservative and lack power when the experimental treatments are very different therapies rather than doses of the same drug. Moreover, the designs use a fixed allocation ratio. In this article, Fisher's least significant difference method extended to group-sequential response-adaptive designs is investigated. It is shown mathematically that the information time continues after dropping inferior arms, and hence the error-spending approach can be used to control the family-wise error rate. Two optimal allocations were considered. One ensures efficient estimation of the treatment effects and the other maximises the power subject to a fixed total sample size. Operating characteristics of the group-sequential response-adaptive design for normal and censored survival outcomes based on simulation and redesigning the NeoSphere trial were compared with those of a fixed-sample design. Results show that the adaptive design attains efficient and ethical advantages, and that the family-wise error rate is well controlled.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"9622802251319896"},"PeriodicalIF":1.6,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143493488","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The relative efficiency of staircase and stepped wedge cluster randomised trial designs.
Kelsey L Grantham, Andrew B Forbes, Richard Hooper, Jessica Kasza
Statistical Methods in Medical Research, published online 2025-02-16. DOI: 10.1177/09622802251317613

The stepped wedge design is an appealing longitudinal cluster randomised trial design. However, it places a large burden on participating clusters by requiring all clusters to collect data in all periods of the trial. The staircase design may be a desirable alternative: its treatment sequences consist of a limited number of measurement periods before and after the implementation of the intervention. In this article, we explore the efficiency of the stepped wedge design relative to several variants of the 'basic staircase' design, which has one control period followed by one intervention period in each sequence. We model outcomes using linear mixed models and consider a sampling scheme in which each participant is measured once. We first consider a basic staircase design embedded within the stepped wedge design, then basic staircase designs with either more clusters or larger cluster-period sizes, with the same total number of participants as, and with fewer total participants than, the stepped wedge design. The relative efficiency of the designs depends on the intracluster correlation structure, the correlation parameters and the trial configuration, including the number of sequences and the cluster-period size. For a wide range of realistic trial settings, a basic staircase design will deliver greater statistical power than a stepped wedge design with the same number of participants, and in some cases even with fewer total participants.
{"title":"Long-term Dagum-power variance function frailty regression model: Application in health studies.","authors":"Agatha Sacramento Rodrigues, Patrick Borges","doi":"10.1177/09622802241304113","DOIUrl":"https://doi.org/10.1177/09622802241304113","url":null,"abstract":"<p><p>Survival models with cure fractions, known as long-term survival models, are widely used in epidemiology to account for both immune and susceptible patients regarding a failure event. In such studies, it is also necessary to estimate unobservable heterogeneity caused by unmeasured prognostic factors. Moreover, the hazard function may exhibit a non-monotonic shape, specifically, an unimodal hazard function. In this article, we propose a long-term survival model based on a defective version of the Dagum distribution, incorporating a power variance function frailty term to account for unobservable heterogeneity. This model accommodates survival data with cure fractions and non-monotonic hazard functions. The distribution is reparameterized in terms of the cure fraction, with covariates linked via a logit link, allowing for direct interpretation of covariate effects on the cure fraction-an uncommon feature in defective approaches. We present maximum likelihood estimation for model parameters, assess performance through Monte Carlo simulations, and illustrate the model's applicability using two health-related datasets: severe COVID-19 in pregnant and postpartum women and patients with malignant skin neoplasms.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"9622802241304113"},"PeriodicalIF":1.6,"publicationDate":"2025-02-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143400106","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Jointly assessing multiple endpoints in pilot and feasibility studies.","authors":"Robert N Montgomery, Amy E Bodde, Eric D Vidoni","doi":"10.1177/09622802241311219","DOIUrl":"https://doi.org/10.1177/09622802241311219","url":null,"abstract":"<p><p>Pilot and feasibility studies are routinely used to determine whether a definitive trial should be pursued; however, the methodologies used to assess feasibility endpoints are often basic and are rarely informed by the requirements of the planned future trial. We propose a new method for analyzing feasibility outcomes which can incorporate relationships between endpoints, utilize a preliminary study design for a future trial and allow for multiple types of feasibility endpoints. The approach specifies a Joint Feasibility Space (JFS) which is the combination of feasibility outcomes that would render a future trial feasible. We estimate the probability of being in the JFS using Bayesian methods and use simulation to create a decision rule based on frequentist operating characteristics. We compare our approach to other general-purpose methods in the literature with simulation and show that our approach has approximately the same performance when analyzing a single feasibility endpoint but is more efficient with more than one endpoint. Feasibility endpoints should be the focus of pilot and feasibility studies. The analyses of these endpoints deserve more attention than they are given, and we have provided a new, effective method their assessment.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"9622802241311219"},"PeriodicalIF":1.6,"publicationDate":"2025-02-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143400103","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust propensity score estimation via loss function calibration.","authors":"Yimeng Shang, Yu-Han Chiu, Lan Kong","doi":"10.1177/09622802241308709","DOIUrl":"10.1177/09622802241308709","url":null,"abstract":"<p><p>Propensity score estimation is often used as a preliminary step to estimate the average treatment effect with observational data. Nevertheless, misspecification of propensity score models undermines the validity of effect estimates in subsequent analyses. Prediction-based machine learning algorithms are increasingly used to estimate propensity scores to allow for more complex relationships between covariates. However, these approaches may not necessarily achieve covariates balancing. We propose a calibration-based method to better incorporate covariate balance properties in a general modeling framework. Specifically, we calibrate the loss function by adding a covariate imbalance penalty to standard parametric (e.g. logistic regressions) or machine learning models (e.g. neural networks). Our approach may mitigate the impact of model misspecification by explicitly taking into account the covariate balance in the propensity score estimation process. The empirical results show that the proposed method is robust to propensity score model misspecification. The integration of loss function calibration improves the balance of covariates and reduces the root-mean-square error of causal effect estimates. When the propensity score model is misspecified, the neural-network-based model yields the best estimator with less bias and smaller variance as compared to other methods considered.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"9622802241308709"},"PeriodicalIF":1.6,"publicationDate":"2025-02-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143411114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Using Bayesian evidence synthesis to quantify uncertainty in population trends in smoking behaviour.
Stephen Wade, Peter Sarich, Pavla Vaneckova, Silvia Behar-Harpaz, Preston J Ngo, Paul B Grogan, Sonya Cressman, Coral E Gartner, John M Murray, Tony Blakely, Emily Banks, Martin C Tammemagi, Karen Canfell, Marianne F Weber, Michael Caruana
Statistical Methods in Medical Research, published online 2025-02-12. DOI: 10.1177/09622802241310326

Simulation models of smoking behaviour provide vital forecasts of exposure to inform policy targets, estimates of the burden of disease, and impacts of tobacco control interventions. A key element of useful model-based forecasts is a clear picture of the uncertainty due to the data used to inform the model; however, assessment of this parameter uncertainty is incomplete in almost all tobacco control models. As a remedy, we demonstrate a Bayesian approach to model calibration that quantifies parameter uncertainty. With a model calibrated to Australian data, we observed that the smoking cessation rate in Australia has increased with calendar year since the late 20th century, and that in 2016 people who smoked quit at a rate of 4.7 quit-events per 100 person-years (90% equal-tailed interval (ETI): 4.5-4.9). We found that those who quit smoking before age 30 years switched to reporting that they never smoked at a rate of approximately 2% annually (90% ETI: 1.9-2.2%). The Bayesian approach demonstrated here can be used as a blueprint for modelling other population behaviours that are challenging to measure directly, and for providing a clearer picture of uncertainty to decision-makers.