Biometrics, Pub Date: 2025-07-03, DOI: 10.1093/biomtc/ujaf083
Raanju R Sundararajan, Scott A Bruce
{"title":"Frequency band analysis of nonstationary multivariate time series.","authors":"Raanju R Sundararajan, Scott A Bruce","doi":"10.1093/biomtc/ujaf083","DOIUrl":"10.1093/biomtc/ujaf083","url":null,"abstract":"<p><p>Information from frequency bands in biomedical time series provides useful summaries of the observed signal. Many existing methods consider summaries of the time series obtained over a few well-known, pre-defined frequency bands of interest. However, there is a dearth of data-driven methods for identifying frequency bands that optimally summarize frequency-domain information in the time series. A new method to identify partition points in the frequency space of a multivariate locally stationary time series is proposed. These partition points signify changes across frequencies in the time-varying behavior of the signal and provide frequency band summary measures that best preserve nonstationary dynamics of the observed series. An $L_2$-norm based discrepancy measure that finds differences in the time-varying spectral density matrix is constructed, and its asymptotic properties are derived. New nonparametric bootstrap tests are also provided to identify significant frequency partition points and to identify components and cross-components of the spectral matrix exhibiting changes over frequencies. Finite-sample performance of the proposed method is illustrated via simulations. 
The proposed method is used to develop optimal frequency band summary measures for characterizing time-varying behavior in resting-state electroencephalography time series, as well as identifying components and cross-components associated with each frequency partition point.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 3","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12290460/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144706182","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics, Pub Date: 2025-07-03, DOI: 10.1093/biomtc/ujaf090
Francis K C Hui, Samuel Muller, Alan H Welsh
{"title":"Adjusted predictions for generalized estimating equations.","authors":"Francis K C Hui, Samuel Muller, Alan H Welsh","doi":"10.1093/biomtc/ujaf090","DOIUrl":"https://doi.org/10.1093/biomtc/ujaf090","url":null,"abstract":"<p><p>Generalized estimating equations (GEEs) are a popular statistical method for longitudinal data analysis, requiring specification of the first 2 marginal moments of the response along with a working correlation matrix to capture temporal correlations within a cluster. When it comes to prediction at future/new time points using GEEs, a standard approach adopted by practitioners and software is to base it simply on the marginal mean model. In this article, we propose an alternative approach to prediction for independent cluster GEEs. By viewing the GEE as solving an iterative working linear model, we borrow ideas from universal kriging to construct an adjusted predictor that exploits working cross-correlations between the current and new observations within the same cluster. We establish theoretical conditions for the adjusted GEE predictor to outperform the standard GEE predictor. 
Simulations and an application to longitudinal data on the growth of sitka spruces demonstrate that, even when we misspecify the working correlation structure, adjusted GEE predictors can achieve better performance relative to standard GEE predictors, the so-called \"oracle\" GEE predictor using all time points, and potentially even cluster-specific predictions from a generalized linear mixed model.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 3","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144706180","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics, Pub Date: 2025-07-03, DOI: 10.1093/biomtc/ujaf098
Matthew Pryce, Karla Diaz-Ordaz, Ruth H Keogh, Stijn Vansteelandt
{"title":"Causal machine learning for heterogeneous treatment effects in the presence of missing outcome data.","authors":"Matthew Pryce, Karla Diaz-Ordaz, Ruth H Keogh, Stijn Vansteelandt","doi":"10.1093/biomtc/ujaf098","DOIUrl":"https://doi.org/10.1093/biomtc/ujaf098","url":null,"abstract":"<p><p>When estimating heterogeneous treatment effects, missing outcome data can complicate treatment effect estimation, causing certain subgroups of the population to be poorly represented. In this work, we discuss this commonly overlooked problem and consider the impact that missing at random outcome data has on causal machine learning estimators for the conditional average treatment effect (CATE). We propose 2 de-biased machine learning estimators for the CATE, the mDR-learner, and mEP-learner, which address the issue of under-representation by integrating inverse probability of censoring weights into the DR-learner and EP-learner, respectively. We show that under reasonable conditions, these estimators are oracle efficient and illustrate their favorable performance through simulated data settings, comparing them to existing CATE estimators, including comparison to estimators that use common missing data techniques. 
We present an example of their application using the GBSG2 trial, exploring treatment effect heterogeneity when comparing hormonal therapies to non-hormonal therapies among breast cancer patients post surgery, and offer guidance on the decisions a practitioner must make when implementing these estimators.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 3","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144752242","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics, Pub Date: 2025-07-03, DOI: 10.1093/biomtc/ujaf087
Huu-Dinh Huynh, J Andrew Royle, Wen-Han Hwang
{"title":"A flexible framework for N-mixture occupancy models: applications to breeding bird surveys.","authors":"Huu-Dinh Huynh, J Andrew Royle, Wen-Han Hwang","doi":"10.1093/biomtc/ujaf087","DOIUrl":"https://doi.org/10.1093/biomtc/ujaf087","url":null,"abstract":"<p><p>Estimating species abundance under imperfect detection is a key challenge in biodiversity conservation. The N-mixture model, widely recognized for its ability to distinguish between abundance and individual detection probability without marking individuals, is constrained by its stringent closure assumption, which leads to biased estimates when violated in real-world settings. To address this limitation, we propose an extended framework based on a development of the mixed Gamma-Poisson model, incorporating a community parameter that represents the proportion of individuals consistently present throughout the survey period. This flexible framework generalizes both the zero-inflated type occupancy model and the standard N-mixture model as special cases, corresponding to community parameter values of 0 and 1, respectively. The model's effectiveness is validated through simulations and applications to real-world datasets, specifically with 5 species from the North American Breeding Bird Survey and 46 species from the Swiss Breeding Bird Survey, demonstrating its improved accuracy and adaptability in settings where strict closure may not hold.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 3","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144706178","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
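For orientation, the standard N-mixture model that the abstract names as a special case can be written as a site-level likelihood. The sketch below is only that textbook special case (latent abundance N ~ Poisson(lam), repeated counts y_t | N ~ Binomial(N, p)), not the authors' extended Gamma-Poisson framework. The function names and the truncation bound `n_max` are illustrative assumptions.

```python
import math


def poisson_pmf(n, lam):
    """P(N = n) for N ~ Poisson(lam), computed on the log scale for stability."""
    return math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))


def binom_pmf(y, n, p):
    """P(Y = y) for Y ~ Binomial(n, p)."""
    return math.comb(n, y) * p**y * (1 - p) ** (n - y)


def nmixture_site_likelihood(counts, lam, p, n_max=200):
    """Likelihood of repeated counts at one site under the standard N-mixture model.

    The latent abundance N is summed out from max(counts) up to n_max
    (n_max is a hypothetical truncation bound for the infinite sum).
    """
    total = 0.0
    for n in range(max(counts), n_max + 1):
        term = poisson_pmf(n, lam)
        for y in counts:
            term *= binom_pmf(y, n, p)
        total += term
    return total


# Three repeat visits to one site, assumed mean abundance 5 and detection 0.4.
lik = nmixture_site_likelihood([3, 2, 4], lam=5.0, p=0.4)
```

With a single visit the sum collapses to a Poisson(lam * p) probability (Poisson thinning), which gives a quick sanity check on the implementation.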
Biometrics, Pub Date: 2025-07-03, DOI: 10.1093/biomtc/ujaf091
Erin E Gabriel, Michael C Sachs, Ingeborg Waernbaum, Els Goetghebeur, Paul F Blanche, Stijn Vansteelandt, Arvid Sjölander, Thomas Scheike
{"title":"Correction to \"Propensity weighting plus adjustment in proportional hazards model is not doubly robust,\" by Erin E. Gabriel, Michael C. Sachs, Ingeborg Waernbaum, Els Goetghebeur, Paul F. Blanche, Stijn Vansteelandt, Arvid Sjölander, and Thomas Scheike; Volume 80, Issue 3, September 2024, https://doi.org/10.1093/biomtc/ujae069.","authors":"Erin E Gabriel, Michael C Sachs, Ingeborg Waernbaum, Els Goetghebeur, Paul F Blanche, Stijn Vansteelandt, Arvid Sjölander, Thomas Scheike","doi":"10.1093/biomtc/ujaf091","DOIUrl":"https://doi.org/10.1093/biomtc/ujaf091","url":null,"abstract":"","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 3","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144706181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics, Pub Date: 2025-07-03, DOI: 10.1093/biomtc/ujaf092
Lingxiao Wang
{"title":"Using model-assisted calibration methods to improve efficiency of regression analyses using two-phase samples or pooled samples under complex survey designs.","authors":"Lingxiao Wang","doi":"10.1093/biomtc/ujaf092","DOIUrl":"10.1093/biomtc/ujaf092","url":null,"abstract":"<p><p>Two-phase sampling designs are frequently applied in epidemiological studies and large-scale health surveys. In such designs, certain variables are collected exclusively within a second-phase random subsample of the initial first-phase sample, often due to factors such as high costs, response burden, or constraints on data collection or assessment. Consequently, second-phase sample estimators can be inefficient due to the diminished sample size. Model-assisted calibration methods have been used to improve the efficiency of second-phase estimators in regression analysis. However, limited literature provides valid finite population inferences of the calibration estimators that use appropriate calibration auxiliary variables while simultaneously accounting for the complex sample designs in the first- and second-phase samples. Moreover, no literature considers the \"pooled design\" where some covariates are measured exclusively in certain repeated survey cycles. This paper proposes calibrating the sample weights for the second-phase sample to the weighted first-phase sample based on score functions of the regression model that uses predictions of the second-phase variable for the first-phase sample. We establish the consistency of estimation using calibrated weights and provide variance estimation for the regression coefficients under the two-phase design or the pooled design nested within complex survey designs. Empirical evidence highlights the efficiency and robustness of the proposed calibration compared to existing calibration and imputation methods. 
Data examples from the National Health and Nutrition Examination Survey are provided.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 3","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12288669/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144706201","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics, Pub Date: 2025-07-03, DOI: 10.1093/biomtc/ujaf094
John Neuhaus, Charles McCulloch, Ross Boylan
{"title":"Improved prediction and flagging of extreme random effects for non-Gaussian outcomes using weighted methods.","authors":"John Neuhaus, Charles McCulloch, Ross Boylan","doi":"10.1093/biomtc/ujaf094","DOIUrl":"10.1093/biomtc/ujaf094","url":null,"abstract":"<p><p>Investigators often focus on predicting extreme random effects from mixed effects models fitted to longitudinal or clustered data, and on identifying or \"flagging\" outliers such as poorly performing hospitals or rapidly deteriorating patients. Our recent work with Gaussian outcomes showed that weighted prediction methods can substantially reduce mean square error of prediction for extremes and substantially increase correct flagging rates compared to previous methods, while controlling the incorrect flagging rates. This paper extends the weighted prediction methods to non-Gaussian outcomes such as binary and count data. Closed-form expressions for predicted random effects and probabilities of correct and incorrect flagging are not available for the usual non-Gaussian outcomes, and the computational challenges are substantial. Therefore, our results include the development of theory to support algorithms that tune predictors that we call \"self-calibrated\" (which control the incorrect flagging rate using very simple flagging rules) and innovative numerical methods to calculate weighted predictors as well as to evaluate their performance. Comprehensive numerical evaluations show that the novel weighted predictors for non-Gaussian outcomes have substantially lower mean square error of prediction at the extremes and considerably higher correct flagging rates than previously proposed methods, while controlling the incorrect flagging rates. 
We illustrate our new methods using data on emergency room readmissions for children with asthma.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 3","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12309285/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144741072","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics, Pub Date: 2025-07-03, DOI: 10.1093/biomtc/ujaf088
Simon N Wood
{"title":"Simple simulation based reconstruction of incidence rates from death data.","authors":"Simon N Wood","doi":"10.1093/biomtc/ujaf088","DOIUrl":"10.1093/biomtc/ujaf088","url":null,"abstract":"<p><p>Daily deaths from an infectious disease provide a means for retrospectively inferring daily incidence, given knowledge of the infection-to-death interval distribution. Existing methods for doing so rely either on fitting simplified non-linear epidemic models to the deaths data or on spline based deconvolution approaches. The former runs the risk of introducing unintended artefacts via the model formulation, while the latter may be viewed as technically obscure, impeding uptake by practitioners. This note proposes a simple simulation based approach to inferring fatal incidence from deaths that requires minimal assumptions, is easy to understand, and allows testing of alternative hypothesized incidence trajectories. The aim is that in any future situation similar to the COVID pandemic, the method can be easily, rapidly, transparently, and uncontroversially deployed as an input to management.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 3","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144706185","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
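The link between fatal incidence and deaths that the abstract relies on can be illustrated with a forward simulation: each eventually-fatal infection is assigned a delay drawn from the infection-to-death distribution, and deaths are tallied by day. This is a minimal sketch of that forward step only, not the reconstruction procedure of the paper. The function names, the toy delay distribution, and the incidence values are all made up for illustration.

```python
import random


def simulate_deaths(fatal_incidence, delay_pmf, n_days, rng):
    """Simulate daily deaths from a hypothesized fatal-incidence trajectory.

    fatal_incidence[s]: number of eventually-fatal infections on day s.
    delay_pmf[d]: probability that death occurs d days after infection.
    """
    deaths = [0] * n_days
    delays = list(range(len(delay_pmf)))
    for day, n_infected in enumerate(fatal_incidence):
        for d in rng.choices(delays, weights=delay_pmf, k=n_infected):
            if day + d < n_days:  # deaths beyond the window are not observed
                deaths[day + d] += 1
    return deaths


def expected_deaths(fatal_incidence, delay_pmf, n_days):
    """Expected daily deaths: discrete convolution of incidence with the delay pmf."""
    out = [0.0] * n_days
    for s, inc in enumerate(fatal_incidence):
        for d, prob in enumerate(delay_pmf):
            if s + d < n_days:
                out[s + d] += inc * prob
    return out


# Toy inputs (assumed, not taken from the paper).
incidence = [10, 0, 5]
delay_pmf = [0.0, 0.5, 0.5]
sim = simulate_deaths(incidence, delay_pmf, n_days=5, rng=random.Random(1))
mean = expected_deaths(incidence, delay_pmf, n_days=5)
```

Comparing simulated death series under alternative hypothesized incidence trajectories with the observed deaths is one way to read the abstract's remark about testing such trajectories, though the paper's own procedure may differ.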
Biometrics, Pub Date: 2025-07-03, DOI: 10.1093/biomtc/ujaf089
Fangting Zhou, Kejun He, Yang Ni
{"title":"Tree-based additive noise directed acyclic graphical models for nonlinear causal discovery with interactions.","authors":"Fangting Zhou, Kejun He, Yang Ni","doi":"10.1093/biomtc/ujaf089","DOIUrl":"10.1093/biomtc/ujaf089","url":null,"abstract":"<p><p>Directed acyclic graphical models with additive noises are essential in nonlinear causal discovery and have numerous applications in various domains, such as social science and systems biology. Most such models further assume that structural causal functions are additive to ensure causal identifiability and computational feasibility, which may be too restrictive in the presence of causal interactions. Some methods consider general nonlinear causal functions represented by, for example, Gaussian processes and neural networks, to accommodate interactions. However, they are either computationally intensive or lack interpretability. We propose a highly interpretable and computationally feasible approach using trees to incorporate interactions in nonlinear causal discovery, termed tree-based additive noise models. The nature of the tree construction leads to piecewise constant causal functions, making existing causal identifiability results of additive noise models with continuous and smooth causal functions inapplicable. Therefore, we provide new conditions under which the proposed model is identifiable. We develop a recursive algorithm for source node identification and a score-based ordering search algorithm. Through extensive simulations, we demonstrate the utility of the proposed model and algorithms benchmarking against existing additive noise models, especially when there are strong causal interactions. 
Our method is applied to infer a protein-protein interaction network for breast cancer, where proteins may form protein complexes to perform their functions.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 3","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12288665/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144706199","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics, Pub Date: 2025-07-03, DOI: 10.1093/biomtc/ujaf095
Guorong Dai, Raymond J Carroll, Jinbo Chen
{"title":"Valid and efficient inference for nonparametric variable importance in two-phase studies.","authors":"Guorong Dai, Raymond J Carroll, Jinbo Chen","doi":"10.1093/biomtc/ujaf095","DOIUrl":"https://doi.org/10.1093/biomtc/ujaf095","url":null,"abstract":"<p><p>We consider a common nonparametric regression setting, where the data consist of a response variable Y, some easily obtainable covariates $\\mathbf{X}$, and a set of costly covariates $\\mathbf{Z}$. Before establishing predictive models for Y, a natural question arises: Is it worthwhile to include $\\mathbf{Z}$ as predictors, given the additional cost of collecting data on $\\mathbf{Z}$ for both training the models and predicting Y for future individuals? Therefore, we aim to conduct preliminary investigations to infer importance of $\\mathbf{Z}$ in predicting Y in the presence of $\\mathbf{X}$. To achieve this goal, we propose a nonparametric variable importance measure for $\\mathbf{Z}$. It is defined as a parameter that aggregates maximum potential contributions of $\\mathbf{Z}$ in single or multiple predictive models, with contributions quantified by general loss functions. Considering two-phase data that provide a large number of observations for $(Y,\\mathbf{X})$ with the expensive $\\mathbf{Z}$ measured only in a small subsample, we develop a novel approach to infer the proposed importance measure, accommodating missingness of $\\mathbf{Z}$ in the sample by substituting functions of $(Y,\\mathbf{X})$ for each individual's contribution to the predictive loss of models involving $\\mathbf{Z}$. Our approach attains unified and efficient inference regardless of whether $\\mathbf{Z}$ makes zero or positive contribution to predicting Y, a desirable yet surprising property owing to data incompleteness. As intermediate steps of our theoretical development, we establish novel results in two relevant research areas, semi-supervised inference and two-phase nonparametric estimation. 
Numerical results from both simulated and real data demonstrate superior performance of our approach.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 3","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144752245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}