Analyzing Coarsened and Missing Data by Imputation Methods
Lars L J van der Burg, Stefan Böhringer, Jonathan W Bartlett, Tjalling Bosse, Nanda Horeweg, Liesbeth C de Wreede, Hein Putter
Statistics in Medicine 44(6): e70032, 15 March 2025. DOI: 10.1002/sim.70032
Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11881681/pdf/
Abstract: In various missing data problems, values are not entirely missing but are coarsened. For coarsened observations, instead of observing the true value, a subset of values - strictly smaller than the full sample space of the variable - is observed to which the true value belongs. In our motivating example of patients with endometrial carcinoma, the degree of lymphovascular space invasion (LVSI) can be either absent, focally present, or substantially present. For a subset of individuals, however, LVSI is reported only as being present, which includes both non-absent options. In the analysis of such a dataset, difficulties arise when coarsened observations are to be used in an imputation procedure. To our knowledge, no clear-cut method has been described in the literature on how to handle an observed subset of values, and treating these observations as entirely missing could lead to biased estimates. Therefore, in this paper, we evaluate the best strategy to deal with coarsened and missing data in multiple imputation. We test a number of plausible ad hoc approaches, possibly already in use by statisticians. Additionally, we propose a principled approach to this problem, consisting of an adaptation of the SMC-FCS algorithm (SMC-FCS_CoCo: Coarsening Compatible) that ensures that imputed values adhere to the coarsening information. These methods are compared in a simulation study. This comparison shows that methods that prevent imputation of incompatible values, like the SMC-FCS_CoCo method, perform consistently better in terms of lower bias and RMSE, and achieve better coverage than methods that ignore coarsening or handle it in a more naïve way. The analysis of the motivating example shows that the way the coarsening information is handled can matter substantially, leading to different conclusions across methods. Overall, our proposed SMC-FCS_CoCo method outperforms the other methods in handling coarsened data, requires limited additional computational cost, and is easily extendable to other scenarios.
A Spline-Based Approach to Smoothly Constrain Hazard Ratios With a View to Apply Treatment Effect Waning
Angus C Jennings, Mark J Rutherford, Paul C Lambert
Statistics in Medicine 44(6): e70035, 15 March 2025. DOI: 10.1002/sim.70035
Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11891414/pdf/
Objectives: To describe and assess, via simulation, a constraint-based spline approach to implement smooth hazard ratio (HR) waning in time-to-event analyses.
Methods: A common consideration when extrapolating survival functions to evaluate the long-term performance of a novel intervention is the scenario in which the beneficial effect of the intervention eventually disappears (treatment effect waning). One approach to relaxing the proportional hazards assumption for a treatment effect is to model it as a function of the timescale, with a spline function offering a flexible approach. We consider constraining the coefficients of the spline variables to 0 during estimation, leading to log treatment effects that are constrained to 0 (HR = 1) from a given time point, thereby enforcing treatment efficacy waning. An example is reported. Datasets were simulated under a variety of scenarios and analyzed with treatment effect waning assumptions under various modeling choices. Bias in the mean survival time difference, given fully observed or fully censored waning, was assessed, and constrained HR estimates were visualized.
Results: Given full waning, biases were small unless constraints directly contradicted the truth. When waning was extrapolated, akin to real-life practice, biases over observed periods were minimized by including a knot at the 95th percentile. The rate at which the HR waned slowed as the upper boundary knot/constraint was placed later, inducing less conservative treatment effect waning assumptions.
Conclusion: An alternative approach to modeling smooth treatment efficacy waning is demonstrated, enabling HR conditioning and marginal restricted mean survival time (RMST) calculation in a single framework, along with applications of the method beyond this use.
Modelling Volume-Outcome Relationships in Health Care
Maurilio Gutzeit, Johannes Rauh, Maximilian Kähler, Jona Cederbaum
Statistics in Medicine 44(6): e10339, 15 March 2025. DOI: 10.1002/sim.10339
Abstract: Despite the ongoing strong interest in associations between quality of care and the volume of health care providers, a unified statistical framework for analyzing them is missing, and many studies suffer from poor statistical modelling choices. We propose a flexible additive mixed model for studying volume-outcome associations in health care that takes into account individual patient characteristics as well as provider-specific effects through a hierarchical approach. More specifically, we treat volume as a continuous variable, and its effect on the outcome is modelled as a smooth function. We account for different case-mixes by including patient-specific risk factors, and for clustering on the provider level through random intercepts. This strategy enables us to extract a smooth volume effect as well as volume-independent provider effects. These two quantities can be compared directly in terms of their magnitude, which gives insight into the sources of variability in quality of care. Based on a causal DAG, we derive conditions under which the volume effect can be interpreted as a causal effect. The paper provides confidence sets for each of the estimated quantities, relying on joint estimation of all effects and parameters. Our approach is illustrated through simulation studies and an application to German health care data on mortality of very low birth weight infants.
{"title":"A Seamless Design for the Combination of a Case-Control and a Cohort Diagnostic Accuracy Study.","authors":"Eric Bibiza-Freiwald, Werner Vach, Antonia Zapf","doi":"10.1002/sim.70016","DOIUrl":"10.1002/sim.70016","url":null,"abstract":"<p><p>In determining the accuracy of a new diagnostic test, often two steps are performed. In the first step, a case-control study is performed as an efficient but potentially biased design. In a second step, a population-based cohort study is performed as an unbiased but less efficient design. In order to accelerate diagnostic research, it has recently been suggested to combine the two designs in one seamless design. In this article, we present a more in-depth description of this idea. The seamless diagnostic accuracy study design is formally introduced by comparison with the traditional pathway, and the basic design decisions are discussed: A stopping rule and a stopping time. An appealing feature of the design is the possibility to ignore the seamless design in the final analysis, although part of the data is used already in an interim analysis. The justification for this strategy is provided by a large-scale simulation study. The simulation study suggests also that the risk of a loss of power due to using a seamless design can be limited by a reasonable choice of the futility boundaries, defining the stopping rule. We conclude that the seamless diagnostic accuracy study design seems to be ready to use. It promises to accelerate diagnostic research, in particular if population-based cohort studies can be started without great efforts and if the reference standard can be evaluated with little delay.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"44 6","pages":"e70016"},"PeriodicalIF":1.8,"publicationDate":"2025-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11881794/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143557861","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Preservation of Type I Error for Partially-Unblinded Sample Size Re-Estimation
Ann Marie K Weideman, Kevin J Anstrom, Gary G Koch, Xianming Tan
Statistics in Medicine 44(6): e70030, 15 March 2025. DOI: 10.1002/sim.70030
Abstract: Sample size re-estimation (SSR) at an interim analysis allows for adjustments based on accrued data. Existing strategies rely on either blinded or unblinded methods to inform such adjustments and, ideally, perform these adjustments in a way that preserves the Type I error at the nominal level. Here, we propose an approach that uses partially-unblinded methods for SSR for both binary and continuous endpoints. Although this approach involves operational unblinding, its partial use of the unblinded information for SSR does not include the interim effect size, hence the term 'partially-unblinded'. Through proof-of-concept and simulation studies, we demonstrate that these adjustments can be made without compromising the Type I error rate. We also investigate different mathematical expressions for SSR under different variance scenarios: homogeneity, heterogeneity, and a combination of both. Of particular interest is the third form, dual variance, for which we provide additional clarifications for binary outcomes and derive an analogous form for continuous outcomes. We show that the corresponding mathematical expressions for the dual variance method are a compromise between those for variance homogeneity and heterogeneity, resulting in sample size estimates that are bounded between those produced by the other expressions, and we extend their applicability to adaptive trial design.
Pattern Mixture Sensitivity Analyses via Multiple Imputations for Non-Ignorable Dropout in Joint Modeling of Cognition and Risk of Dementia
Tetiana Gorbach, James R Carpenter, Chris Frost, Maria Josefsson, Jennifer Nicholas, Lars Nyberg
Statistics in Medicine 44(6): e70040, 15 March 2025. DOI: 10.1002/sim.70040
Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11905689/pdf/
Abstract: Motivated by the Swedish Betula study, we consider the joint modeling of longitudinal memory assessments and the hazard of dementia. In the Betula data, the time to dementia onset or its absence is available for all participants, while some memory measurements are missing. In longitudinal studies of aging, one cannot rule out the possibility of dropout due to health issues, resulting in longitudinal measurements that are missing not at random. We therefore propose a pattern-mixture sensitivity analysis for missing-not-at-random data in the joint modeling framework. The sensitivity analysis is implemented via multiple imputation as follows: (i) multiply impute the missing-not-at-random longitudinal measurements under a set of plausible pattern-mixture imputation models that allow for acceleration of memory decline after dropout, (ii) fit the joint model to each imputed longitudinal memory and time-to-dementia dataset, and (iii) combine the results of step (ii). Our work illustrates that sensitivity analyses via multiple imputation are an accessible, pragmatic method to evaluate the consequences of missing-not-at-random data on inference and prediction. This flexible approach can accommodate a range of models for the longitudinal and event-time processes. In particular, the pattern-mixture modeling approach provides an accessible way to frame plausible missing-not-at-random assumptions for different missing data patterns. Applying our approach to the Betula study shows that worse memory levels and steeper memory decline were associated with a higher risk of dementia in all considered scenarios.
{"title":"Overcoming Model Uncertainty - How Equivalence Tests Can Benefit From Model Averaging.","authors":"Niklas Hagemann, Kathrin Möllenhoff","doi":"10.1002/sim.10309","DOIUrl":"10.1002/sim.10309","url":null,"abstract":"<p><p>A common problem in numerous research areas, particularly in clinical trials, is to test whether the effect of an explanatory variable on an outcome variable is equivalent across different groups. In practice, these tests are frequently used to compare the effect between patient groups, for example, based on gender, age, or treatments. Equivalence is usually assessed by testing whether the difference between the groups does not exceed a pre-specified equivalence threshold. Classical approaches are based on testing the equivalence of single quantities, for example, the mean, the area under the curve or other values of interest. However, when differences depending on a particular covariate are observed, these approaches can turn out to be not very accurate. Instead, whole regression curves over the entire covariate range, describing for instance the time window or a dose range, are considered and tests are based on a suitable distance measure of two such curves, as, for example, the maximum absolute distance between them. In this regard, a key assumption is that the true underlying regression models are known, which is rarely the case in practice. However, misspecification can lead to severe problems as inflated type I errors or, on the other hand, conservative test procedures. In this paper, we propose a solution to this problem by introducing a flexible extension of such an equivalence test using model averaging in order to overcome this assumption and making the test applicable under model uncertainty. Precisely, we introduce model averaging based on smooth Bayesian information criterion weights and we propose a testing procedure which makes use of the duality between confidence intervals and hypothesis testing. We demonstrate the validity of our approach by means of a simulation study and illustrate its practical relevance considering a time-response case study with toxicological gene expression data.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"44 6","pages":"e10309"},"PeriodicalIF":1.8,"publicationDate":"2025-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11923417/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143664445","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive Weight Selection for Time-To-Event Data Under Non-Proportional Hazards.","authors":"Moritz Fabian Danzer, Ina Dormuth","doi":"10.1002/sim.70045","DOIUrl":"10.1002/sim.70045","url":null,"abstract":"<p><p>When planning a clinical trial for a time-to-event endpoint, we require an estimated effect size and need to consider the type of effect. Usually, an effect of proportional hazards is assumed with the hazard ratio as the corresponding effect measure. Thus, the standard procedure for survival data is generally based on a single-stage log-rank test. Knowing that the assumption of proportional hazards is often violated and sufficient knowledge to derive reasonable effect sizes is usually unavailable, such an approach is relatively rigid. We introduce a more flexible procedure by combining two methods designed to be more robust in case we have little to no prior knowledge. First, we employ a more flexible adaptive multi-stage design instead of a single-stage design. Second, we apply combination-type tests in the first stage of our suggested procedure to benefit from their robustness under uncertainty about the deviation pattern. We can then use the data collected during this period to choose a more specific single-weighted log-rank test for the subsequent stages. In this step, we employ Royston-Parmar spline models to extrapolate the survival curves to make a reasonable decision. Based on a real-world data example, we show that our approach can save a trial that would otherwise end with an inconclusive result. Additionally, our simulation studies demonstrate a sufficient power performance while maintaining more flexibility.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"44 6","pages":"e70045"},"PeriodicalIF":1.8,"publicationDate":"2025-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11912538/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143650864","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Group Sequential Test for Two-Sample Ordinal Outcome Measures
Yuan Wu, Ryan A Simmons, Baoshan Zhang, Jesse D Troy
Statistics in Medicine 44(6): e70053, 15 March 2025. DOI: 10.1002/sim.70053
Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11925493/pdf/
Abstract: Group sequential trials include interim monitoring points at which futility or efficacy decisions can potentially be reached early. This approach to trial design can safeguard patients, provide efficacious treatments to patients early, and save money and time. Group sequential methods are well developed for bell-shaped continuous, binary, and time-to-event outcomes. In this paper, we propose a group sequential design using the Mann-Whitney-Wilcoxon test for general two-sample ordinal data. We establish that the proposed test statistic is asymptotically normal and that the sequential statistics satisfy the assumptions of Brownian motion. We also include results of finite-sample simulation studies showing that our proposed approach has an advantage over existing methods in controlling the Type I error while maintaining power for small sample sizes. A real data set is used to illustrate the proposed method, and a sample size calculation approach is proposed for designing new studies.
Efficient Computation of High-Dimensional Penalized Piecewise Constant Hazard Random Effects Models
Hillary M Heiling, Naim U Rashid, Quefeng Li, Xianlu L Peng, Jen Jen Yeh, Joseph G Ibrahim
Statistics in Medicine 44(6): e10311, 15 March 2025. DOI: 10.1002/sim.10311
Abstract: Identifying and characterizing relationships between treatments, exposures, or other covariates and time-to-event outcomes is of great significance in a wide range of biomedical settings. In research areas such as multi-center clinical trials, recurrent events, and genetic studies, proportional hazards mixed effects models (PHMMs) are used to account for correlations observed in clusters within the data. In high dimensions, proper specification of the fixed and random effects within PHMMs is difficult and computationally complex. In this paper, we approximate the proportional hazards mixed effects model with a piecewise constant hazard mixed effects survival model. We estimate the model parameters using a modified Monte Carlo expectation conditional minimization (MCECM) algorithm, allowing us to perform variable selection on the fixed and random effects simultaneously. We also incorporate a factor model decomposition of the random effects in order to scale the variable selection method more easily to larger dimensions. We demonstrate the utility of our method using simulations, and we apply it to a multi-study pancreatic ductal adenocarcinoma gene expression dataset to select features important for survival.