BiometricsPub Date : 2025-07-03DOI: 10.1093/biomtc/ujaf083
Raanju R Sundararajan, Scott A Bruce
{"title":"Frequency band analysis of nonstationary multivariate time series.","authors":"Raanju R Sundararajan, Scott A Bruce","doi":"10.1093/biomtc/ujaf083","DOIUrl":"10.1093/biomtc/ujaf083","url":null,"abstract":"<p><p>Information from frequency bands in biomedical time series provides useful summaries of the observed signal. Many existing methods consider summaries of the time series obtained over a few well-known, pre-defined frequency bands of interest. However, there is a dearth of data-driven methods for identifying frequency bands that optimally summarize frequency-domain information in the time series. A new method to identify partition points in the frequency space of a multivariate locally stationary time series is proposed. These partition points signify changes across frequencies in the time-varying behavior of the signal and provide frequency band summary measures that best preserve nonstationary dynamics of the observed series. An $L_2$-norm based discrepancy measure that finds differences in the time-varying spectral density matrix is constructed, and its asymptotic properties are derived. New nonparametric bootstrap tests are also provided to identify significant frequency partition points and to identify components and cross-components of the spectral matrix exhibiting changes over frequencies. Finite-sample performance of the proposed method is illustrated via simulations. The proposed method is used to develop optimal frequency band summary measures for characterizing time-varying behavior in resting-state electroencephalography time series, as well as identifying components and cross-components associated with each frequency partition point.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 3","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12290460/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144706182","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
BiometricsPub Date : 2025-07-03DOI: 10.1093/biomtc/ujaf107
Wei Liu, Qingzhi Zhong
{"title":"High-dimensional multi-study multi-modality covariate-augmented generalized factor model.","authors":"Wei Liu, Qingzhi Zhong","doi":"10.1093/biomtc/ujaf107","DOIUrl":"10.1093/biomtc/ujaf107","url":null,"abstract":"<p><p>Latent factor models that integrate data from multiple sources/studies or modalities have garnered considerable attention across various disciplines. However, existing methods predominantly focus either on multi-study integration or multi-modality integration, rendering them insufficient for analyzing the diverse modalities measured across multiple studies. To address this limitation and cater to practical needs, we introduce a high-dimensional generalized factor model that seamlessly integrates multi-modality data from multiple studies, while also accommodating additional covariates. We conduct a thorough investigation of the identifiability conditions to enhance the model's interpretability. To tackle the complexity of high-dimensional nonlinear integration caused by 4 large latent random matrices, we utilize a variational lower bound to approximate the observed log-likelihood by employing a variational posterior distribution. By profiling the variational parameters, we establish the asymptotical properties of estimators for model parameters using M-estimation theory. Furthermore, we devise a computationally efficient variational expectation maximization (EM) algorithm to execute the estimation process and a criterion to determine the optimal number of both study-shared and study-specific factors. Extensive simulation studies and a real-world application show that the proposed method significantly outperforms existing methods in terms of estimation accuracy and computational efficiency.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 3","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144871261","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
BiometricsPub Date : 2025-07-03DOI: 10.1093/biomtc/ujaf112
Belmiro P M Duarte, Anthony C Atkinson, Nuno M C Oliveira
{"title":"Model robust designs for dose-response models.","authors":"Belmiro P M Duarte, Anthony C Atkinson, Nuno M C Oliveira","doi":"10.1093/biomtc/ujaf112","DOIUrl":"https://doi.org/10.1093/biomtc/ujaf112","url":null,"abstract":"<p><p>An optimal experimental design is a structured data collection plan aimed at maximizing the amount of information gathered. Determining an optimal experimental design, however, relies on the assumption that a predetermined model structure, relating the response and covariates, is known a priori. In practical scenarios, such as dose-response modeling, the form of the model representing the \"true\" relationship is frequently unknown, although there exists a finite set or pool of potential alternative models. Designing experiments based on a single model from this set may lead to inefficiency or inadequacy if the \"true\" model differs from that assumed when calculating the design. One approach to minimize the impact of the uncertainty in the model on the experimental plan is known as model robust design. In this context, we systematically address the challenge of finding approximate optimal model robust experimental designs. Our focus is on locally optimal designs, so allowing some of the models in the pool to be nonlinear. We present three Semidefinite Programming-based formulations, each aligned with one of the classes of model robustness criteria introduced by Läuter. These formulations exploit the semidefinite representability of the robustness criteria, leading to the representation of the robust problem as a semidefinite program. To ensure comparability of information measures across various models, we employ standardized designs. To illustrate the application of our approach, we consider a dose-response study where, initially, seven models were postulated as potential candidates to describe the dose-response relationship.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 3","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144941118","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
BiometricsPub Date : 2025-07-03DOI: 10.1093/biomtc/ujaf113
Kai Chen, Yuqian Zhang
{"title":"Semi-supervised linear regression: enhancing efficiency and robustness in high dimensions.","authors":"Kai Chen, Yuqian Zhang","doi":"10.1093/biomtc/ujaf113","DOIUrl":"https://doi.org/10.1093/biomtc/ujaf113","url":null,"abstract":"<p><p>In semi-supervised learning, the prevailing understanding suggests that observing additional unlabeled samples improves estimation accuracy for linear parameters only in the case of model misspecification. In this work, we challenge such a claim and show that additional unlabeled samples are beneficial in high-dimensional settings. Initially focusing on a dense scenario, we introduce robust semi-supervised estimators for the regression coefficient without relying on sparse structures in the population slope. Even when the true underlying model is linear, we show that leveraging information from large-scale unlabeled data helps reduce estimation bias, thereby improving both estimation accuracy and inference robustness. Moreover, we propose semi-supervised methods with further enhanced efficiency in scenarios with a sparse linear slope. The performance of the proposed methods is demonstrated through extensive numerical studies.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 3","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144941170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
BiometricsPub Date : 2025-07-03DOI: 10.1093/biomtc/ujaf090
Francis K C Hui, Samuel Muller, Alan H Welsh
{"title":"Adjusted predictions for generalized estimating equations.","authors":"Francis K C Hui, Samuel Muller, Alan H Welsh","doi":"10.1093/biomtc/ujaf090","DOIUrl":"https://doi.org/10.1093/biomtc/ujaf090","url":null,"abstract":"<p><p>Generalized estimating equations (GEEs) are a popular statistical method for longitudinal data analysis, requiring specification of the first 2 marginal moments of the response along with a working correlation matrix to capture temporal correlations within a cluster. When it comes to prediction at future/new time points using GEEs, a standard approach adopted by practitioners and software is to base it simply on the marginal mean model. In this article, we propose an alternative approach to prediction for independent cluster GEEs. By viewing the GEE as solving an iterative working linear model, we borrow ideas from universal kriging to construct an adjusted predictor that exploits working cross-correlations between the current and new observations within the same cluster. We establish theoretical conditions for the adjusted GEE predictor to outperform the standard GEE predictor. Simulations and an application to longitudinal data on the growth of sitka spruces demonstrate that, even when we misspecify the working correlation structure, adjusted GEE predictors can achieve better performance relative to standard GEE predictors, the so-called \"oracle\" GEE predictor using all time points, and potentially even cluster-specific predictions from a generalized linear mixed model.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 3","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144706180","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
BiometricsPub Date : 2025-07-03DOI: 10.1093/biomtc/ujaf098
Matthew Pryce, Karla Diaz-Ordaz, Ruth H Keogh, Stijn Vansteelandt
{"title":"Causal machine learning for heterogeneous treatment effects in the presence of missing outcome data.","authors":"Matthew Pryce, Karla Diaz-Ordaz, Ruth H Keogh, Stijn Vansteelandt","doi":"10.1093/biomtc/ujaf098","DOIUrl":"https://doi.org/10.1093/biomtc/ujaf098","url":null,"abstract":"<p><p>When estimating heterogeneous treatment effects, missing outcome data can complicate treatment effect estimation, causing certain subgroups of the population to be poorly represented. In this work, we discuss this commonly overlooked problem and consider the impact that missing at random outcome data has on causal machine learning estimators for the conditional average treatment effect (CATE). We propose 2 de-biased machine learning estimators for the CATE, the mDR-learner, and mEP-learner, which address the issue of under-representation by integrating inverse probability of censoring weights into the DR-learner and EP-learner, respectively. We show that under reasonable conditions, these estimators are oracle efficient and illustrate their favorable performance through simulated data settings, comparing them to existing CATE estimators, including comparison to estimators that use common missing data techniques. We present an example of their application using the GBSG2 trial, exploring treatment effect heterogeneity when comparing hormonal therapies to non-hormonal therapies among breast cancer patients post surgery, and offer guidance on the decisions a practitioner must make when implementing these estimators.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 3","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144752242","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
BiometricsPub Date : 2025-07-03DOI: 10.1093/biomtc/ujaf094
John Neuhaus, Charles McCulloch, Ross Boylan
{"title":"Improved prediction and flagging of extreme random effects for non-Gaussian outcomes using weighted methods.","authors":"John Neuhaus, Charles McCulloch, Ross Boylan","doi":"10.1093/biomtc/ujaf094","DOIUrl":"10.1093/biomtc/ujaf094","url":null,"abstract":"<p><p>Investigators often focus on predicting extreme random effects from mixed effects models fitted to longitudinal or clustered data, and on identifying or \"flagging\" outliers such as poorly performing hospitals or rapidly deteriorating patients. Our recent work with Gaussian outcomes showed that weighted prediction methods can substantially reduce mean square error of prediction for extremes and substantially increase correct flagging rates compared to previous methods, while controlling the incorrect flagging rates. This paper extends the weighted prediction methods to non-Gaussian outcomes such as binary and count data. Closed-form expressions for predicted random effects and probabilities of correct and incorrect flagging are not available for the usual non-Gaussian outcomes, and the computational challenges are substantial. Therefore, our results include the development of theory to support algorithms that tune predictors that we call \"self-calibrated\" (which control the incorrect flagging rate using very simple flagging rules) and innovative numerical methods to calculate weighted predictors as well as to evaluate their performance. Comprehensive numerical evaluations show that the novel weighted predictors for non-Gaussian outcomes have substantially lower mean square error of prediction at the extremes and considerably higher correct flagging rates than previously proposed methods, while controlling the incorrect flagging rates. We illustrate our new methods using data on emergency room readmissions for children with asthma.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 3","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12309285/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144741072","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
BiometricsPub Date : 2025-07-03DOI: 10.1093/biomtc/ujaf092
Lingxiao Wang
{"title":"Using model-assisted calibration methods to improve efficiency of regression analyses using two-phase samples or pooled samples under complex survey designs.","authors":"Lingxiao Wang","doi":"10.1093/biomtc/ujaf092","DOIUrl":"10.1093/biomtc/ujaf092","url":null,"abstract":"<p><p>Two-phase sampling designs are frequently applied in epidemiological studies and large-scale health surveys. In such designs, certain variables are collected exclusively within a second-phase random subsample of the initial first-phase sample, often due to factors such as high costs, response burden, or constraints on data collection or assessment. Consequently, second-phase sample estimators can be inefficient due to the diminished sample size. Model-assisted calibration methods have been used to improve the efficiency of second-phase estimators in regression analysis. However, limited literature provides valid finite population inferences of the calibration estimators that use appropriate calibration auxiliary variables while simultaneously accounting for the complex sample designs in the first- and second-phase samples. Moreover, no literature considers the \"pooled design\" where some covariates are measured exclusively in certain repeated survey cycles. This paper proposes calibrating the sample weights for the second-phase sample to the weighted first-phase sample based on score functions of the regression model that uses predictions of the second-phase variable for the first-phase sample. We establish the consistency of estimation using calibrated weights and provide variance estimation for the regression coefficients under the two-phase design or the pooled design nested within complex survey designs. Empirical evidence highlights the efficiency and robustness of the proposed calibration compared to existing calibration and imputation methods. Data examples from the National Health and Nutrition Examination Survey are provided.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 3","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12288669/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144706201","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
BiometricsPub Date : 2025-07-03DOI: 10.1093/biomtc/ujaf087
Huu-Dinh Huynh, J Andrew Royle, Wen-Han Hwang
{"title":"A flexible framework for N-mixture occupancy models: applications to breeding bird surveys.","authors":"Huu-Dinh Huynh, J Andrew Royle, Wen-Han Hwang","doi":"10.1093/biomtc/ujaf087","DOIUrl":"https://doi.org/10.1093/biomtc/ujaf087","url":null,"abstract":"<p><p>Estimating species abundance under imperfect detection is a key challenge in biodiversity conservation. The N-mixture model, widely recognized for its ability to distinguish between abundance and individual detection probability without marking individuals, is constrained by its stringent closure assumption, which leads to biased estimates when violated in real-world settings. To address this limitation, we propose an extended framework based on a development of the mixed Gamma-Poisson model, incorporating a community parameter that represents the proportion of individuals consistently present throughout the survey period. This flexible framework generalizes both the zero-inflated type occupancy model and the standard N-mixture model as special cases, corresponding to community parameter values of 0 and 1, respectively. The model's effectiveness is validated through simulations and applications to real-world datasets, specifically with 5 species from the North American Breeding Bird Survey and 46 species from the Swiss Breeding Bird Survey, demonstrating its improved accuracy and adaptability in settings where strict closure may not hold.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 3","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144706178","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
BiometricsPub Date : 2025-07-03DOI: 10.1093/biomtc/ujaf091
Erin E Gabriel, Michael C Sachs, Ingeborg Waernbaum, Els Goetghebeur, Paul F Blanche, Stijn Vansteelandt, Arvid Sjölander, Thomas Scheike
{"title":"Correction to \"Propensity weighting plus adjustment in proportional hazards model is not doubly robust,\" by Erin E. Gabriel, Michael C. Sachs, Ingeborg Waernbaum, Els Goetghebeur, Paul F. Blanche, Stijn Vansteelandt, Arvid Sjölander, and Thomas Scheike; Volume 80, Issue 3, September 2024, https://doi.org/10.1093/biomtc/ujae069.","authors":"Erin E Gabriel, Michael C Sachs, Ingeborg Waernbaum, Els Goetghebeur, Paul F Blanche, Stijn Vansteelandt, Arvid Sjölander, Thomas Scheike","doi":"10.1093/biomtc/ujaf091","DOIUrl":"https://doi.org/10.1093/biomtc/ujaf091","url":null,"abstract":"","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 3","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144706181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}