Apostolos Gkatzionis, Shaun R Seaman, Rachael A Hughes, Kate Tilling
{"title":"Relationship between collider bias and interactions on the log-additive scale.","authors":"Apostolos Gkatzionis, Shaun R Seaman, Rachael A Hughes, Kate Tilling","doi":"10.1177/09622802241306860","DOIUrl":"10.1177/09622802241306860","url":null,"abstract":"<p><p>Collider bias occurs when conditioning on a common effect (collider) of two variables X, Y. In this article, we quantify the collider bias in the estimated association between exposure X and outcome Y induced by selecting on one value of a binary collider S of the exposure and the outcome. In the case of logistic regression, it is known that the magnitude of the collider bias in the exposure-outcome regression coefficient is proportional to the strength of interaction δ₃ between X and Y in a log-additive model for the collider: P(S = 1 | X, Y) = exp{δ₀ + δ₁X + δ₂Y + δ₃XY}. We show that this result also holds under a linear or Poisson regression model for the exposure-outcome association. We then illustrate numerically that even if a log-additive model with interactions is not the true model for the collider, the interaction term in such a model is still informative about the magnitude of collider bias. 
Finally, we discuss the implications of these findings for methods that attempt to adjust for collider bias, such as inverse probability weighting which is often implemented without including interactions between variables in the weighting model.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"1063-1078"},"PeriodicalIF":1.6,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12209546/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143537748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
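The logistic-regression case described in the abstract above can be checked with a small exact calculation. The sketch below is an illustration only, not the authors' code: it assumes binary X and Y, and the population cell probabilities and the values of δ₀ to δ₃ are hypothetical numbers chosen so that every selection probability stays below 1. Under the log-additive selection model, the log odds ratio among selected subjects differs from the population log odds ratio by exactly δ₃.

```python
import math


def selection_prob(x, y, d0, d1, d2, d3):
    """Log-additive selection model: P(S=1 | X=x, Y=y) = exp(d0 + d1*x + d2*y + d3*x*y)."""
    return math.exp(d0 + d1 * x + d2 * y + d3 * x * y)


def log_odds_ratio(cells):
    """Log odds ratio of a 2x2 table keyed by (x, y); the scale of the cells cancels out."""
    return math.log(
        cells[(1, 1)] * cells[(0, 0)] / (cells[(1, 0)] * cells[(0, 1)])
    )


# Hypothetical population joint distribution P(X=x, Y=y) for binary X and Y.
population = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

# Hypothetical coefficients (delta0, delta1, delta2, delta3); their sum is negative,
# so every selection probability is a valid probability.
deltas = (-2.0, 0.3, 0.4, 0.5)

# Unnormalised joint distribution of (X, Y) among selected subjects (S = 1).
selected = {xy: p * selection_prob(*xy, *deltas) for xy, p in population.items()}

# Collider bias on the log odds ratio scale.
bias = log_odds_ratio(selected) - log_odds_ratio(population)
print(f"collider bias = {bias:.6f}, delta3 = {deltas[3]:.6f}")
```

Setting the last entry of `deltas` to 0 makes the bias vanish regardless of the main-effect coefficients, which is the sense in which the interaction term drives the bias in this setting.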
Ornella Moro, Inger Torhild Gram, Maja-Lisa Løchen, Marit B Veierød, Ana Maria Wägner, Giovanni Sebastiani
{"title":"Quantification of the influence of risk factors with application to cardiovascular diseases in subjects with type 1 diabetes.","authors":"Ornella Moro, Inger Torhild Gram, Maja-Lisa Løchen, Marit B Veierød, Ana Maria Wägner, Giovanni Sebastiani","doi":"10.1177/09622802251327680","DOIUrl":"10.1177/09622802251327680","url":null,"abstract":"<p><p>Future occurrence of a disease can be highly influenced by some specific risk factors. This work presents a comprehensive approach to quantify the event probability as a function of each separate risk factor by means of a parametric model. The proposed methodology is mainly described and applied here in the case of a linear model, but the non-linear case is also addressed. To improve estimation accuracy, three distinct methods are developed and their results are integrated. One of them is Bayesian, based on a non-informative prior. Each of the other two uses aggregation of sample elements based on their factor values, which is optimized by means of a different specific criterion. For one of these two, optimization is performed by Simulated Annealing. The methodology presented is applicable across various diseases but here we quantify the risk for cardiovascular diseases in subjects with type 1 diabetes. The results obtained combining the three different methods show accurate estimates of cardiovascular risk variation rates for the factors considered. Furthermore, the detection of a biological activation phenomenon for one of the factors is also illustrated. 
To quantify the performance of the proposed methodology and to compare it with that of a known method used for this type of model, a large simulation study is conducted, whose results are illustrated here.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"9622802251327680"},"PeriodicalIF":1.6,"publicationDate":"2025-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144111965","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimation of global average treatment effect in National Heart, Lung, and Blood Institute (NHLBI) Growth and Health Study.","authors":"Lili Yue, Colin O Wu, Gaorong Li, Zhaohai Li","doi":"10.1177/09622802241313288","DOIUrl":"10.1177/09622802241313288","url":null,"abstract":"<p><p>We propose a procedure to estimate the \"time-specific average treatment effect\" and \"global average treatment effect\" for observational studies with outcomes and covariates repeatedly measured over time. This research is motivated by the National Heart, Lung, and Blood Institute Growth and Health Study (NGHS), a longitudinal cohort study that aims to evaluate the influences of race and other risk factors on the levels of blood pressure for children and adolescents. As with most longitudinal cohort studies, we do not have a known propensity score model to further discuss the average treatment effects in the NGHS. To solve this problem, a nonparametric machine learning method, generalized boosted models (GBMs), is used to estimate the propensity score. Based on the estimated propensity score, the \"time-specific average treatment effect\" can be obtained through inverse probability weighting methods, and the \"global average treatment effect\" is then obtained. 
We apply the proposed GBM-based estimation method to the NGHS blood pressure data and demonstrate through a simulation study that the GBM-based estimation method is superior to the commonly used logistic regression-based method.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"956-967"},"PeriodicalIF":1.6,"publicationDate":"2025-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144034767","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
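The inverse probability weighting step mentioned in the abstract above can be sketched for a single time point as follows. This is a generic textbook-style IPW estimator under stated assumptions, not the authors' procedure: the propensity scores are taken as given rather than estimated with generalized boosted models, the function name `ipw_ate` is hypothetical, and the four data points are made up purely for illustration.

```python
def ipw_ate(outcomes, treatments, propensities):
    """Inverse probability weighted estimate of an average treatment effect.

    outcomes:     observed outcome for each subject
    treatments:   1 if the subject was treated, 0 otherwise
    propensities: P(treatment = 1 | covariates) for each subject,
                  assumed here to be supplied by some fitted model
    """
    n = len(outcomes)
    # Weighted mean of outcomes under treatment: treated subjects get weight 1/e.
    treated_mean = sum(
        t * y / e for y, t, e in zip(outcomes, treatments, propensities)
    ) / n
    # Weighted mean of outcomes under control: controls get weight 1/(1 - e).
    control_mean = sum(
        (1 - t) * y / (1 - e) for y, t, e in zip(outcomes, treatments, propensities)
    ) / n
    return treated_mean - control_mean


# Made-up data for one time point; constant propensity 0.5 mimics a fair coin assignment.
outcomes = [3.0, 1.0, 4.0, 2.0]
treatments = [1, 0, 1, 0]
propensities = [0.5, 0.5, 0.5, 0.5]

ate = ipw_ate(outcomes, treatments, propensities)
print(f"IPW estimate of the average treatment effect: {ate}")
```

With a constant propensity of 0.5 the estimate reduces to the difference in group means, which makes the toy result easy to verify by hand. In an observational study the propensities would vary by subject and would have to be estimated, for example by logistic regression or boosting.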
{"title":"Effect estimation in the presence of a misclassified binary mediator.","authors":"Kimberly A Hochstedler Webb, Martin T Wells","doi":"10.1177/09622802251316970","DOIUrl":"10.1177/09622802251316970","url":null,"abstract":"<p><p>Mediation analyses allow researchers to quantify the effect of an exposure variable on an outcome variable through a mediator variable. If a binary mediator variable is misclassified, the resulting analysis can be severely biased. Misclassification is especially difficult to deal with when it is differential and when there are no gold standard labels available. Previous work has addressed this problem using a sensitivity analysis framework or by assuming that misclassification rates are known. We leverage a variable related to the misclassification mechanism to recover unbiased parameter estimates without using gold standard labels. The proposed methods require the reasonable assumption that the sum of the sensitivity and specificity is greater than 1. Three correction methods are presented: (1) An ordinary least squares correction for Normal outcome models, (2) a multi-step predictive value weighting method, and (3) a seamless expectation-maximization algorithm. We apply our misclassification correction strategies to investigate the mediating role of gestational hypertension on the association between maternal age and pre-term birth.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"1037-1059"},"PeriodicalIF":1.6,"publicationDate":"2025-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143670956","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hierarchical Bayesian bivariate spatial modeling of small area proportions with application to health survey data.","authors":"Hanjun Yu, Xinyi Xu, Lichao Yu","doi":"10.1177/09622802251316968","DOIUrl":"10.1177/09622802251316968","url":null,"abstract":"<p><p>In this article, we propose bivariate small area estimation methods for proportions based on the logit-normal mixed models with latent spatial dependence. We incorporate multivariate conditional autoregressive structures for the random effects under the hierarchical Bayesian modeling framework, and extend the methods to accommodate non-sampled regions. Posterior inference is obtained via adaptive Markov chain Monte Carlo algorithms. Extensive simulation studies are carried out to demonstrate the effectiveness of the proposed bivariate spatial models. The results suggest that the proposed methods are more efficient than the univariate and non-spatial methods in estimation and prediction, particularly when bivariate spatial dependence exists. Practical guidelines for model selection based on the simulation results are provided. We further illustrate the application of our methods by estimating the province-level heart disease rates and dyslipidemia rates among the middle-aged and elderly population in China's 31 mainland provinces in 2020.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"1018-1036"},"PeriodicalIF":1.6,"publicationDate":"2025-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143392034","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xinyu Zhang, Erich J Greene, Ondrej Blaha, Wei Wei
{"title":"Statistical considerations for evaluating treatment effect under various non-proportional hazard scenarios.","authors":"Xinyu Zhang, Erich J Greene, Ondrej Blaha, Wei Wei","doi":"10.1177/09622802241313297","DOIUrl":"10.1177/09622802241313297","url":null,"abstract":"<p><p>We conducted a systematic comparison of statistical methods used for the analysis of time-to-event outcomes under various proportional and non-proportional hazard (NPH) scenarios. Our study used data from recently published oncology trials to compare the Log-rank test, still by far the most widely used option, against some available alternatives, including the MaxCombo test, the Restricted Mean Survival Time difference test, the Generalized Gamma model and the Generalized F model. Power, type I error rate, and time-dependent bias with respect to the survival probability and median survival time were used to evaluate and compare the performance of these methods. In addition to the real data, we simulated three hypothetical scenarios with crossing hazards chosen so that the early and late effects \"cancel out\" and used them to evaluate the ability of the aforementioned methods to detect time-specific and overall treatment effects. We implemented novel metrics for assessing the time-dependent bias in treatment effect estimates to provide a more comprehensive evaluation in NPH scenarios. 
Recommendations under each NPH scenario are provided by examining the type I error rate, power, and time-dependent bias associated with each statistical approach.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"986-1000"},"PeriodicalIF":1.6,"publicationDate":"2025-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143392085","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Marcel Wolbers, Mar Vázquez Rabuñal, Ke Li, Kaspar Rufibach, Daniel Sabanés Bové
{"title":"Using shrinkage methods to estimate treatment effects in overlapping subgroups in randomized clinical trials with a time-to-event endpoint.","authors":"Marcel Wolbers, Mar Vázquez Rabuñal, Ke Li, Kaspar Rufibach, Daniel Sabanés Bové","doi":"10.1177/09622802241313292","DOIUrl":"10.1177/09622802241313292","url":null,"abstract":"<p><p>In randomized controlled trials, forest plots are frequently used to investigate the homogeneity of treatment effect estimates in pre-defined subgroups. However, the interpretation of subgroup-specific treatment effect estimates requires great care due to the smaller sample size of subgroups and the large number of investigated subgroups. Bayesian shrinkage methods have been proposed to address these issues, but they often focus on disjoint subgroups while subgroups displayed in forest plots are overlapping, i.e., each subject appears in multiple subgroups. In our proposed approach, we first build a flexible Cox model based on all available observations, including treatment-by-subgroup interaction terms for all subgroups of interest. We explore penalized partial likelihood estimation with lasso or ridge penalties for interaction terms, and Bayesian estimation with a regularized horseshoe prior. In a second step, the Cox model is marginalized to obtain treatment effect estimates for all subgroups. We illustrate these methods using data from a randomized clinical trial in follicular lymphoma and evaluate their properties in a simulation study. In all simulation scenarios, the overall mean-squared error is substantially smaller for penalized and shrinkage estimators compared to the standard subgroup-specific treatment effect estimator but leads to some bias for heterogeneous subgroups. A naive overall sample estimator also outperforms the standard subgroup-specific estimator in terms of the overall mean-squared error for all scenarios except for a scenario with substantial heterogeneity. 
We recommend that subgroup-specific estimators are routinely complemented by treatment effect estimators based on shrinkage methods. The proposed methods are implemented in the R package bonsaiforest.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"903-914"},"PeriodicalIF":1.6,"publicationDate":"2025-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143701487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Guillermo Briseño Sanchez, Nadja Klein, Hannah Klinkhammer, Andreas Mayr
{"title":"Boosting distributional copula regression for bivariate binary, discrete and mixed responses.","authors":"Guillermo Briseño Sanchez, Nadja Klein, Hannah Klinkhammer, Andreas Mayr","doi":"10.1177/09622802241313294","DOIUrl":"10.1177/09622802241313294","url":null,"abstract":"<p><p>Motivated by challenges in the analysis of biomedical data and observational studies, we develop statistical boosting for the general class of bivariate distributional copula regression with arbitrary marginal distributions, which is suited for binary, count, continuous or mixed outcomes. To arrive at a flexible model for the entire conditional distribution, not only the marginal distribution parameters but also the copula parameters are related to covariates through additive predictors. We suggest estimation by means of an adapted component-wise gradient boosting algorithm. A key benefit of boosting as opposed to classical likelihood or Bayesian estimation is the implicit data-driven variable selection mechanism as well as shrinkage. To the best of our knowledge, our implementation is the only one that combines a wide range of covariate effects, marginal distributions, copula functions, and implicit data-driven variable selection. We showcase the versatility of our approach to data from genetic epidemiology, healthcare utilization and childhood undernutrition. 
Our developments are implemented in the R package gamboostLSS, fostering transparent and reproducible research.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"887-902"},"PeriodicalIF":1.6,"publicationDate":"2025-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12177205/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143674451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Additive hazard causal model with a binary instrumental variable.","authors":"Zhisong Zhao, Huijuan Ma, Yong Zhou","doi":"10.1177/09622802251314288","DOIUrl":"10.1177/09622802251314288","url":null,"abstract":"<p><p>The causal effect of a treatment on a censored outcome is often of fundamental interest, and an instrumental variable (IV) is a useful tool to tame bias caused by unmeasured confounding. The two-stage least squares method commonly used for IV analysis in linear regression has been developed for regression analysis in a survival context under an additive hazards model. In this work, we study a distinctive binary IV framework with censored data where the causal treatment effect is quantified through an additive hazard model for compliers. Employing the special characteristics of the binary IV and adapting the principle of conditional score, we establish a weighted estimator with explicit form. We establish the asymptotic properties of the proposed estimators and provide plug-in and perturbed variance estimators. The finite sample performance of the proposed estimator is examined by extensive simulations. The proposed method is applied to a data set from the U.S. Renal Data System to compare dialytic modality-specific survival for end-stage renal disease patients.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"867-886"},"PeriodicalIF":1.6,"publicationDate":"2025-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143670949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A James O'Malley, Yifan Zhao, Carly Bobak, Chuanling Qin, Erika L Moen, Daniel N Rockmore
{"title":"Methodology for supervised optimization of the construction of physician shared-patient networks.","authors":"A James O'Malley, Yifan Zhao, Carly Bobak, Chuanling Qin, Erika L Moen, Daniel N Rockmore","doi":"10.1177/09622802241313281","DOIUrl":"10.1177/09622802241313281","url":null,"abstract":"<p><p>There is growing use of shared-patient physician networks in health services research and practice, but minimal study of the consequences of decisions made in constructing them. To address this gap, we surveyed physician employees of a National Physician Organization (NPO) on their peer physician relationships. Using the physicians' survey nominations as ground truths, we evaluated the diagnostic accuracy of shared-patient edge-weights and the optimal construction of physician networks from sequences of patient-physician encounters. To further improve diagnostic accuracy, we optimized network construction with respect to the within-dyad difference and summation of edge-strength (two orthogonal measures), optimally combining them to form a final edge-weight. To achieve these goals, we develop statistical procedures to quantify the extent that directionality and other features of referral paths yield edge-weights with improved diagnostic properties. We also develop network models of the survey nominations incorporating directed (edge) and undirected (dyadic) shared-patient network measures as edge and dyad attributes to demonstrate that the measurement of the network as a whole is improved. 
Finally, we estimate the association of the physicians' centrality in the NPO shared-patient network (a sociocentric feature that cannot be evaluated for the partially-measured survey-based network) with their beliefs regarding physician peer-influence.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"938-955"},"PeriodicalIF":1.6,"publicationDate":"2025-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12270385/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143753847","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}