{"title":"Smoothed Estimation on Optimal Treatment Regime Under Semisupervised Setting in Randomized Trials","authors":"Xiaoqi Jiao, Mengjiao Peng, Yong Zhou","doi":"10.1002/bimj.70006","DOIUrl":"10.1002/bimj.70006","url":null,"abstract":"<div><p>A treatment regime refers to the process of assigning the most suitable treatment to a patient based on their observed information. However, prevailing research on treatment regimes predominantly relies on labeled data, which may lead to the omission of valuable information contained within unlabeled data, such as historical records and healthcare databases. Current semisupervised works for deriving optimal treatment regimes either rely on model assumptions or struggle with high computational burdens for even moderate-dimensional covariates. To address this concern, we propose a semisupervised framework that operates within a model-free context to estimate the optimal treatment regime by leveraging the abundant unlabeled data. Our proposed approach encompasses three key steps. First, we employ a single-index model to achieve dimension reduction, followed by kernel regression to impute the missing outcomes in the unlabeled data. Second, we propose various forms of semisupervised value functions based on the imputed values, incorporating both labeled and unlabeled data components. Lastly, the optimal treatment regimes are derived by maximizing the semisupervised value functions. We establish the consistency and asymptotic normality of the estimators proposed in our framework. Furthermore, we introduce a perturbation resampling procedure to estimate the asymptotic variance. Simulations confirm the advantageous properties of incorporating unlabeled data in the estimation for optimal treatment regimes. A practical data example is also provided to illustrate the application of our methodology. 
This work is rooted in the framework of randomized trials, with additional discussions extending to observational studies.</p></div>","PeriodicalId":55360,"journal":{"name":"Biometrical Journal","volume":"66 8","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142696123","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Simulating Data From Marginal Structural Models for a Survival Time Outcome","authors":"Shaun R. Seaman, Ruth H. Keogh","doi":"10.1002/bimj.70010","DOIUrl":"10.1002/bimj.70010","url":null,"abstract":"<p>Marginal structural models (MSMs) are often used to estimate causal effects of treatments on survival time outcomes from observational data when time-dependent confounding may be present. They can be fitted using, for example, inverse probability of treatment weighting (IPTW). It is important to evaluate the performance of statistical methods in different scenarios, and simulation studies are a key tool for such evaluations. In such simulation studies, it is common to generate data in such a way that the model of interest is correctly specified, but this is not always straightforward when the model of interest is for potential outcomes, as is an MSM. Methods have been proposed for simulating from MSMs for a survival outcome, but these methods impose restrictions on the data-generating mechanism. Here, we propose a method that overcomes these restrictions. The MSM can be, for example, a marginal structural logistic model for a discrete survival time or a Cox or additive hazards MSM for a continuous survival time. The hazard of the potential survival time can be conditional on baseline covariates, and the treatment variable can be discrete or continuous. We illustrate the use of the proposed simulation algorithm by carrying out a brief simulation study. 
This study compares the coverage of confidence intervals calculated in two different ways for causal effect estimates obtained by fitting an MSM via IPTW.</p>","PeriodicalId":55360,"journal":{"name":"Biometrical Journal","volume":"66 8","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/bimj.70010","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142696121","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Conditional Variable Screening for Ultra-High Dimensional Longitudinal Data With Time Interactions","authors":"Andrea Bratsberg, Abhik Ghosh, Magne Thoresen","doi":"10.1002/bimj.70005","DOIUrl":"10.1002/bimj.70005","url":null,"abstract":"<p>In recent years, we have been able to gather large amounts of genomic data at a fast rate, creating situations where the number of variables greatly exceeds the number of observations. In these situations, most models that can handle a moderately high dimension will now become computationally infeasible or unstable. Hence, there is a need for a prescreening of variables to reduce the dimension efficiently and accurately to a more moderate scale. There has been much work to develop such screening procedures for independent outcomes. However, much less work has been done for high-dimensional longitudinal data in which the observations can no longer be assumed to be independent. In addition, it is of interest to capture possible interactions between the genomic variable and time in many of these longitudinal studies. In this work, we propose a novel conditional screening procedure that ranks variables according to the likelihood value at the maximum likelihood estimates in a marginal linear mixed model, where the genomic variable and its interaction with time are included in the model. This is to our knowledge the first conditional screening approach for clustered data. 
We prove that this approach enjoys the sure screening property, and assess the finite sample performance of the method through simulations.</p>","PeriodicalId":55360,"journal":{"name":"Biometrical Journal","volume":"66 8","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/bimj.70005","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142696119","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Incompletely Observed Nonparametric Factorial Designs With Repeated Measurements: A Wild Bootstrap Approach","authors":"Lubna Amro, Frank Konietschke, Markus Pauly","doi":"10.1002/bimj.70008","DOIUrl":"10.1002/bimj.70008","url":null,"abstract":"<p>In many life science experiments or medical studies, subjects are repeatedly observed and measurements are collected in factorial designs with multivariate data. The analysis of such multivariate data is typically based on multivariate analysis of variance (MANOVA) or mixed models, requiring complete data and certain assumptions on the underlying parametric distribution such as continuity or a specific covariance structure, for example, compound symmetry. However, these methods are usually not applicable when discrete data or even ordered categorical data are present. In such cases, nonparametric rank-based methods that do not require stringent distributional assumptions are the preferred choice. However, in the multivariate case, most rank-based approaches have only been developed for complete observations. It is the aim of this work to develop asymptotically correct procedures that are capable of handling missing values, allow for singular covariance matrices, and are applicable for ordinal or ordered categorical data. This is achieved by applying a wild bootstrap procedure in combination with quadratic form-type test statistics. Beyond proving their asymptotic correctness, extensive simulation studies validate their applicability for small samples. 
Finally, two real data examples are analyzed.</p>","PeriodicalId":55360,"journal":{"name":"Biometrical Journal","volume":"66 8","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/bimj.70008","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142696120","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Addressing Class Imbalance in Bayesian Classification Through Posterior Probability Adjustment","authors":"Vahid Nassiri, Fetene Tekle, Kanaka Tatikola, Helena Geys","doi":"10.1002/bimj.70004","DOIUrl":"10.1002/bimj.70004","url":null,"abstract":"<div><p>Class imbalance is a known issue in classification tasks that can lead to predictive bias toward dominant classes. This paper introduces a novel, straightforward Bayesian framework that adjusts posterior probabilities to counteract the bias introduced by imbalanced data sets. Instead of relying on the mean posterior distribution of class probabilities, we propose a method that scales the posterior probability of each class according to its representation in the training data.</p></div>","PeriodicalId":55360,"journal":{"name":"Biometrical Journal","volume":"66 8","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142649859","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inverse-Weighted Quantile Regression With Partially Interval-Censored Data","authors":"Yeji Kim, Taehwa Choi, Seohyeon Park, Sangbum Choi, Dipankar Bandyopadhyay","doi":"10.1002/bimj.70001","DOIUrl":"10.1002/bimj.70001","url":null,"abstract":"<p>This paper introduces a novel approach to estimating censored quantile regression using inverse probability of censoring weighted (IPCW) methodology, specifically tailored for data sets featuring partially interval-censored data. Such data sets, often encountered in HIV/AIDS and cancer biomedical research, may include doubly censored (DC) and partly interval-censored (PIC) endpoints. DC responses involve either left-censoring or right-censoring alongside some exact failure time observations, while PIC responses are subject to interval-censoring. Despite the existence of complex estimating techniques for interval-censored quantile regression, we propose a simple and intuitive IPCW-based method, easily implementable by assigning suitable inverse-probability weights to subjects with exact failure time observations. The resulting estimator exhibits asymptotic properties, such as uniform consistency and weak convergence, and we explore an augmented-IPCW (AIPCW) approach to enhance efficiency. In addition, our method can be adapted for multivariate partially interval-censored data. Simulation studies demonstrate the new procedure's strong finite-sample performance. 
We illustrate the practical application of our approach through an analysis of progression-free survival endpoints in a phase III clinical trial focusing on metastatic colorectal cancer.</p>","PeriodicalId":55360,"journal":{"name":"Biometrical Journal","volume":"66 8","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/bimj.70001","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142632702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mixture Cure Semiparametric Accelerated Failure Time Models With Partly Interval-Censored Data","authors":"Isabel Li, Jun Ma, Benoit Liquet","doi":"10.1002/bimj.202300203","DOIUrl":"10.1002/bimj.202300203","url":null,"abstract":"<div><p>In practical survival analysis, the situation of no event for a patient can arise even after a long period of waiting time, which means a portion of the population may never experience the event of interest. Under this circumstance, one remedy is to adopt a mixture cure Cox model to analyze the survival data. However, if the survival times clearly exhibit an acceleration (or deceleration) factor, then an accelerated failure time (AFT) model will be preferred, leading to a mixture cure AFT model. In this paper, we consider a penalized likelihood method to estimate the mixture cure semiparametric AFT models, where the unknown baseline hazard is approximated using Gaussian basis functions. We allow partly interval-censored survival data which can include event times and left-, right-, and interval-censoring times. The penalty function helps to achieve a smooth estimate of the baseline hazard function. We also provide asymptotic properties of the estimates so that inferences can be made on regression parameters and hazard-related quantities. Simulation studies are conducted to evaluate the model performance, which includes a comparative study with an existing method from the <span>smcure</span> <span>R</span> package. The results show that our proposed penalized likelihood method has acceptable performance in general and produces less bias when faced with the identifiability issue compared to <span>smcure</span>. To illustrate the application of our method, a real case study involving melanoma recurrence is conducted and reported. 
Our model is implemented in our R package <span>aftQnp</span> which is available from https://github.com/Isabellee4555/aftQnP.</p></div>","PeriodicalId":55360,"journal":{"name":"Biometrical Journal","volume":"66 8","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142591291","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Replication of Equivalence Studies","authors":"Charlotte Micheloud, Leonhard Held","doi":"10.1002/bimj.202300232","DOIUrl":"10.1002/bimj.202300232","url":null,"abstract":"<p>Replication studies are increasingly conducted to assess the credibility of scientific findings. Most of these replication attempts target studies with a superiority design, but there is a lack of methodology regarding the analysis of replication studies with alternative types of designs, such as equivalence. In order to fill this gap, we propose two approaches, the two-trials rule and the sceptical two one-sided tests (TOST) procedure, adapted from methods used in superiority settings. Both methods have the same overall Type-I error rate, but the sceptical TOST procedure allows replication success even for nonsignificant original or replication studies. This leads to a larger project power and other differences in relevant operating characteristics. Both methods can be used for sample size calculation of the replication study, based on the results from the original one. The two methods are applied to data from the Reproducibility Project: Cancer Biology.</p>","PeriodicalId":55360,"journal":{"name":"Biometrical Journal","volume":"66 8","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/bimj.202300232","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142549079","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Group Integrative Dynamic Factor Models With Application to Multiple Subject Brain Connectivity","authors":"Younghoon Kim, Zachary F. Fisher, Vladas Pipiras","doi":"10.1002/bimj.202300370","DOIUrl":"10.1002/bimj.202300370","url":null,"abstract":"<div><p>This work introduces a novel framework for dynamic factor model-based group-level analysis of multiple subjects time-series data, called GRoup Integrative DYnamic factor (GRIDY) models. The framework identifies and characterizes intersubject similarities and differences between two predetermined groups by considering a combination of group spatial information and individual temporal dynamics. Furthermore, it enables the identification of intrasubject similarities and differences over time by employing different model configurations for each subject. Methodologically, the framework combines a novel principal angle-based rank selection algorithm and a noniterative integrative analysis framework. Inspired by simultaneous component analysis, this approach also reconstructs identifiable latent factor series with flexible covariance structures. The performance of the GRIDY models is evaluated through simulations conducted under various scenarios. An application is also presented to compare resting-state functional MRI data collected from multiple subjects in autism spectrum disorder and control groups.</p></div>","PeriodicalId":55360,"journal":{"name":"Biometrical Journal","volume":"66 8","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142523678","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cross-Cohort Mixture Analysis: A Data Integration Approach With Applications on Gestational Age and DNA-Methylation-Derived Gestational Age Acceleration Metrics","authors":"Elena Colicino, Roberto Ascari, Hachem Saddiki, Francheska Merced-Nieves, Nicolò Foppa Pedretti, Kathi Huddleston, Robert O Wright, Rosalind J Wright, Program Collaborators for Environmental Influences on Child Health Outcomes","doi":"10.1002/bimj.202300270","DOIUrl":"10.1002/bimj.202300270","url":null,"abstract":"<div><p>Data integration of multiple studies can provide enhanced exposure contrast and statistical power to examine associations between environmental exposure mixtures and health outcomes. Extant research has combined populations and identified an overall mixture–outcome association, without accounting for differences across studies. We extended the Bayesian Weighted Quantile Sum (BWQS) regression to a hierarchical framework to analyze mixtures across cohorts. The hierarchical BWQS (HBWQS) approach aggregates sample size of multiple cohorts to calculate an overall mixture index, thereby identifying the most harmful exposure(s) across cohorts; and provides cohort-specific associations between the overall mixture index and the outcome. We showed results from 10 simulated scenarios including four mixture components in three, eight, and ten populations, and two real-case examples on the association between prenatal metal mixture exposure—comprising arsenic, cadmium, and lead—and both gestational age and epigenetic-derived gestational age acceleration metrics. Simulated scenarios showed good empirical coverage and little bias for all HBWQS-estimated parameters. The Watanabe–Akaike information criterion showed a better average performance for the HBWQS regression than the BWQS across scenarios. 
HBWQS results incorporating cohorts within the national Environmental influences on Child Health Outcomes (ECHO) program from three different sites showed that the environmental mixture was negatively associated with gestational age in a single site. The HBWQS approach facilitates the combination of multiple cohorts and accounts for individual cohort differences in mixture analyses. HBWQS findings can be used to develop regulations, policies, and interventions regarding multiple co-occurring environmental exposures and it will maximize the use of extant publicly available data.</p></div>","PeriodicalId":55360,"journal":{"name":"Biometrical Journal","volume":"66 8","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142549077","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}