Katy E Morgan, Ian R White, Clémence Leyrat, Simon Stanworth, Brennan C Kahan
{"title":"Applying the Estimands Framework to Non-Inferiority Trials: Guidance on Choice of Hypothetical Estimands for Non-Adherence and Comparison of Estimation Methods.","authors":"Katy E Morgan, Ian R White, Clémence Leyrat, Simon Stanworth, Brennan C Kahan","doi":"10.1002/sim.10348","DOIUrl":"10.1002/sim.10348","url":null,"abstract":"<p><p>A common concern in non-inferiority (NI) trials is that non-adherence due, for example, to poor study conduct can make treatment arms artificially similar. Because intention-to-treat analyses can be anti-conservative in this situation, per-protocol analyses are sometimes recommended. However, such advice does not consider the estimands framework, nor the risk of bias from per-protocol analyses. We therefore sought to update the above guidance using the estimands framework, and compare estimators to improve on the performance of per-protocol analyses. We argue the main threat to validity of NI trials is the occurrence of \"trial-specific\" intercurrent events (IEs), that is, IEs which occur in a trial setting, but would not occur in practice. To guard against erroneous conclusions of non-inferiority, we suggest an estimand using a hypothetical strategy for trial-specific IEs should be employed, with handling of other non-trial-specific IEs chosen based on clinical considerations. We provide an overview of estimators that could be used to estimate a hypothetical estimand, including inverse probability weighting (IPW), and two instrumental variable approaches (one using an informative Bayesian prior on the effect of standard treatment, and one using a treatment-by-covariate interaction as an instrument). 
We compare them, using simulation in the setting of all-or-nothing compliance in two active treatment arms, and conclude both IPW and the instrumental variable method using a Bayesian prior are potentially useful approaches, with the choice between them depending on which assumptions are most plausible for a given trial.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"44 5","pages":"e10348"},"PeriodicalIF":1.8,"publicationDate":"2025-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11806244/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143374847","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Individualized Time-Varying Nonparametric Model With an Application in Mobile Health.","authors":"Jenifer Rim, Qi Xu, Xiwei Tang, Yuqing Guo, Annie Qu","doi":"10.1002/sim.70005","DOIUrl":"10.1002/sim.70005","url":null,"abstract":"<p><p>Individualized modeling has become increasingly popular in recent years with its growing application in fields such as personalized medicine and mobile health studies. With rich longitudinal measurements, it is of great interest to model certain subject-specific time-varying covariate effects. In this paper, we propose an individualized time-varying nonparametric model by leveraging the subgroup information from the population. The proposed method approximates the time-varying covariate effect using nonparametric B-splines and aggregates the estimated nonparametric coefficients that share common patterns. Moreover, the proposed method can effectively handle various missing data patterns that frequently arise in mobile health data. Specifically, our method achieves subgrouping by flexibly accommodating varying dimensions of B-spline coefficients due to missingness. This capability sets it apart from other fusion-type approaches for subgrouping. The subgroup information can also potentially provide meaningful insight into the characteristics of subjects and assist in recommending an effective treatment or intervention. An efficient ADMM algorithm is developed for implementation. 
Our numerical studies and application to mobile health data on monitoring pregnant women's deep sleep and physical activities demonstrate that the proposed method achieves better performance compared to other existing methods.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"44 5","pages":"e70005"},"PeriodicalIF":1.8,"publicationDate":"2025-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143441906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dylan Spicker, Amir Nazemi, Joy Hutchinson, Paul Fieguth, Sharon Kirkpatrick, Michael Wallace, Kevin W Dodd
{"title":"Challenges for Predictive Modeling With Neural Network Techniques Using Error-Prone Dietary Intake Data.","authors":"Dylan Spicker, Amir Nazemi, Joy Hutchinson, Paul Fieguth, Sharon Kirkpatrick, Michael Wallace, Kevin W Dodd","doi":"10.1002/sim.70013","DOIUrl":"10.1002/sim.70013","url":null,"abstract":"<p><p>Dietary intake data are routinely drawn upon to explore diet-health relationships, and inform clinical practice and public health. However, these data are almost always subject to measurement error, distorting true diet-health relationships. Beyond measurement error, there are likely complex synergistic and sometimes antagonistic interactions between different dietary components, complicating the relationships between diet and health outcomes. Flexible models are required to capture the nuance that these complex interactions introduce. This complexity makes research on diet-health relationships an appealing candidate for the application of modern machine learning techniques, and in particular, neural networks. Neural networks are computational models that can capture highly complex, nonlinear relationships, so long as sufficient data are available. While these models have been applied in many domains, the impacts of measurement error on the performance of predictive modeling have not been widely investigated. In this work, we demonstrate the ways in which measurement error erodes the performance of neural networks and illustrate the care that is required for leveraging these models in the presence of error. We demonstrate the role that sample size and replicate measurements play in model performance, indicate a motivation for the investigation of transformations to additivity, and illustrate the caution required to prevent model overfitting. 
While the past performance of neural networks across various domains makes them an attractive candidate for examining diet-health relationships, our work demonstrates that substantial care and further methodological development are both required to observe increased predictive performance when applying these techniques compared to more traditional statistical procedures.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"44 5","pages":"e70013"},"PeriodicalIF":1.8,"publicationDate":"2025-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11806516/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143374858","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proportional Mean Residual Life Model With Varying Coefficients for Right Censored Data.","authors":"Bing Wang, Xinyuan Song, Qian Zhao","doi":"10.1002/sim.70008","DOIUrl":"10.1002/sim.70008","url":null,"abstract":"<p><p>The mean residual life provides the remaining life expectancy of a subject who has survived to a specific time point. This paper considers a proportional mean residual life model with varying coefficients, which allows one to explore the nonlinear interactions between some covariates and an exposure variable. In a semiparametric setting, we construct local estimating equations to obtain the varying coefficients and establish the asymptotic normality of the proposed estimators. Moreover, the weak convergence property for the local estimator of the baseline mean residual life function is developed. We conduct simulation studies to empirically examine the finite-sample performance of the proposed methods and apply the methodology to a real-life dataset on type 2 diabetic complications.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"44 5","pages":"e70008"},"PeriodicalIF":1.8,"publicationDate":"2025-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143459642","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identification and Estimation of the Average Causal Effects Under Dietary Substitution Strategies.","authors":"Yu-Han Chiu, Lan Wen","doi":"10.1002/sim.70007","DOIUrl":"10.1002/sim.70007","url":null,"abstract":"<p><p>The 2020-2025 Dietary Guidelines suggest that most people can improve their diet by making some changes to what they eat and drink. In many cases, these changes involve simple substitutions. For instance, the Dietary Guidelines recommend choosing chicken instead of processed red meat to reduce sodium intake and switching from refined grains to whole grains to increase dietary fiber intake. The question about such dietary substitution strategies seeks to estimate the average counterfactual outcome under a hypothetical intervention that replaces a food an individual would have consumed in the absence of intervention with a healthier substitute. In this work, we will show the conditions under which the average causal effects of substitution strategies can be non-parametrically identified, and provide efficient estimators for our proposed dietary substitution strategies. We evaluate the performance of our proposed methods via simulation studies and apply them to estimate the effect of substituting processed red meat with chicken on mortality, using data from the Nurses' Health Study.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"44 5","pages":"e70007"},"PeriodicalIF":1.8,"publicationDate":"2025-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11840885/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143459640","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Qi Liu, Jie Bao, Xu Zhang, Chuan Shi, Catherine Liu, Rui Luo
{"title":"A Graph-Theoretic Approach to Detection of Parkinsonian Freezing of Gait From Videos.","authors":"Qi Liu, Jie Bao, Xu Zhang, Chuan Shi, Catherine Liu, Rui Luo","doi":"10.1002/sim.70020","DOIUrl":"10.1002/sim.70020","url":null,"abstract":"<p><p>Freezing of Gait (FOG) is a prevalent symptom in advanced Parkinson's Disease (PD), characterized by intermittent transitions between normal gait and freezing episodes. This study introduces a novel graph-theoretic approach to detect FOG from video data of PD patients. We construct a sequence of pose graphs that represent the spatial relations and temporal progression of a patient's posture over time. Each graph node corresponds to an estimated joint position, while the edges reflect the anatomical connections and their proximity. We propose a hypothesis testing procedure that deploys the Fréchet statistics to identify break points in time between regular gait and FOG episodes, where we model the central tendency and dispersion of the pose graphs in the presentation of graph Laplacian matrices by computing their Fréchet mean and variance. We implement binary segmentation and incremental computation in our algorithm for efficient calculation. The proposed framework is validated on two datasets, Kinect3D and AlphaPose, demonstrating its effectiveness in detecting FOG from video data. The proposed approach that extracts matrix features is distinct from the prevailing pixel-based deep learning methods. 
It provides a new perspective on feature extraction for FOG detection and potentially contributes to improved diagnosis and treatment of PD.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"44 5","pages":"e70020"},"PeriodicalIF":1.8,"publicationDate":"2025-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11841038/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143459639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dir-GLM: A Bayesian GLM With Data-Driven Reference Distribution.","authors":"Entejar Alam, Peter Müller, Paul J Rathouz","doi":"10.1002/sim.10305","DOIUrl":"10.1002/sim.10305","url":null,"abstract":"<p><p>The recently developed semi-parametric generalized linear model (SPGLM) offers more flexibility as compared to the classical GLM by including the baseline or reference distribution of the response as an additional parameter in the model. However, some inference summaries are not easily generated under existing maximum-likelihood-based inference (GLDRM). This includes uncertainty in estimation for model-derived functionals such as exceedance probabilities. The latter are critical in a clinical diagnostic or decision-making setting. In this article, by placing a Dirichlet prior on the baseline distribution, we propose a Bayesian model-based approach for inference to address these important gaps. We establish consistency and asymptotic normality results for the implied canonical parameter. Simulation studies and an illustration with data from an aging research study confirm that the proposed method performs comparably or better in comparison with GLDRM. The proposed Bayesian framework is most attractive for inference with small sample training data or in sparse-data scenarios.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"44 5","pages":"e10305"},"PeriodicalIF":1.8,"publicationDate":"2025-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11839158/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143441889","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shu-Chin Lin, Sheng-Hsuan Lin, Tian Ge, Chia-Yen Chen, Yen-Feng Lin
{"title":"Causal Mediation Analysis: A Summary-Data Mendelian Randomization Approach.","authors":"Shu-Chin Lin, Sheng-Hsuan Lin, Tian Ge, Chia-Yen Chen, Yen-Feng Lin","doi":"10.1002/sim.10317","DOIUrl":"10.1002/sim.10317","url":null,"abstract":"<p><p>Summary-data Mendelian randomization (MR), a widely used approach in causal inference, has recently attracted attention for improving causal mediation analysis. Two existing methods corresponding to the difference method and product method of linear mediation analysis have been developed to perform MR-based mediation analysis using the inverse-variance weighted method (MR-IVW). Despite these developments, there is still a need for more rigorous, efficient, and precise MR-based mediation methodologies. In this study, we develop summary-data MR-based frameworks for causal mediation analysis. We improve the accuracy, statistical efficiency and robustness of the existing MR-based mediation analysis by implementing novel variance estimators for the mediation effects, deriving rigorous procedures for statistical inference, and accounting for widespread pleiotropic effects. Specifically, we propose Diff-IVW and Prod-IVW to improve upon the existing methods and provide the pleiotropy-robust methods (Diff-Egger, Diff-Median, Prod-Egger, and Prod-Median), adapted from MR-Egger and MR-Median, to enhance the robustness of the MR-based mediation analysis. We conduct comprehensive simulation studies to compare the existing and proposed methods. The results show that the proposed methods, Diff-IVW and Prod-IVW, improve statistical efficiency and type I error control over the existing approaches. Although all IVW-based methods suffer from directional pleiotropy biases, the median-based methods (Diff-Median and Prod-Median) can mitigate such biases. The differences among the methods can lead to discrepant statistical conclusions as demonstrated in real data applications. 
Based on our simulation results, we recommend the three proposed methods in practice: Diff-IVW, Prod-IVW, and Prod-Median, which are complementary under various scenarios.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"44 5","pages":"e10317"},"PeriodicalIF":1.8,"publicationDate":"2025-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11799828/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143256814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jeroen M Goedhart, Thomas Klausch, Jurriaan Janssen, Mark A van de Wiel
{"title":"Adaptive Use of Co-Data Through Empirical Bayes for Bayesian Additive Regression Trees.","authors":"Jeroen M Goedhart, Thomas Klausch, Jurriaan Janssen, Mark A van de Wiel","doi":"10.1002/sim.70004","DOIUrl":"10.1002/sim.70004","url":null,"abstract":"<p><p>For clinical prediction applications, we are often faced with small sample size data compared to the number of covariates. Such data pose problems for variable selection and prediction, especially when the covariate-response relationship is complicated. To address these challenges, we propose to incorporate external information on the covariates into Bayesian additive regression trees (BART), a sum-of-trees prediction model that utilizes priors on the tree parameters to prevent overfitting. To incorporate external information, an empirical Bayes (EB) framework is developed that estimates, assisted by a model, prior covariate weights in the BART model. The proposed EB framework enables the estimation of the other prior parameters of BART as well, rendering an appealing and computationally efficient alternative to cross-validation. We show that the method finds relevant covariates and that it improves prediction compared to default BART in simulations. If the covariate-response relationship is non-linear, the method benefits from the flexibility of BART to outperform regression-based learners. 
Finally, the benefit of incorporating external information is shown in an application to diffuse large B-cell lymphoma prognosis based on clinical covariates, gene mutations, DNA translocations, and DNA copy number data.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"44 5","pages":"e70004"},"PeriodicalIF":1.8,"publicationDate":"2025-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11834989/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143441876","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yonatan Woodbridge, Micha Mandel, Yair Goldberg, Amit Huppert
{"title":"Estimating Mean Viral Load Trajectory From Intermittent Longitudinal Data and Unknown Time Origins.","authors":"Yonatan Woodbridge, Micha Mandel, Yair Goldberg, Amit Huppert","doi":"10.1002/sim.70033","DOIUrl":"10.1002/sim.70033","url":null,"abstract":"<p><p>Viral load (VL) in the respiratory tract is the leading proxy for assessing infectiousness potential. Understanding the dynamics of disease-related VL within the host is of great importance, as it helps to determine different policies and health recommendations. However, normally the VL is measured on individuals only once, in order to confirm infection, and furthermore, the infection date is unknown. It is therefore necessary to develop statistical approaches to estimate the typical VL trajectory. We show here that, under plausible parametric assumptions, two measures of VL on infected individuals can be used to accurately estimate the VL mean function. Specifically, we consider a discrete-time likelihood-based approach to modeling and estimating partially observed longitudinal samples. We study a multivariate normal model for a function of the VL that accounts for possible correlation between measurements within individuals. We derive an expectation-maximization (EM) algorithm which treats the unknown time origins and the missing measurements as latent variables. Our main motivation is the reconstruction of the daily mean VL, given measurements on patients whose VLs were measured multiple times on different days. Such data should and can be obtained at the beginning of a pandemic with the specific goal of estimating the VL dynamics. 
For demonstration purposes, the method is applied to SARS-CoV-2 cycle-threshold-value data collected in Israel.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"44 5","pages":"e70033"},"PeriodicalIF":1.8,"publicationDate":"2025-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11851093/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143493468","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}