Biometrika | Pub Date: 2024-07-01 | DOI: 10.1093/biomet/asae033
Lachlan C Astfalck, Adam M Sykulski, Edward J Cripps
{"title":"Debiasing Welch’s Method for Spectral Density Estimation","authors":"Lachlan C Astfalck, Adam M Sykulski, Edward J Cripps","doi":"10.1093/biomet/asae033","DOIUrl":"https://doi.org/10.1093/biomet/asae033","url":null,"abstract":"Summary Welch’s method provides an estimator of the power spectral density that is statistically consistent. This is achieved by averaging over periodograms calculated from overlapping segments of a time series. For a finite-length time series, while the variance of the estimator decreases as the number of segments increases, the magnitude of the estimator’s bias increases: a bias-variance trade-off ensues when setting the segment number. We address this issue by providing a novel method for debiasing Welch’s method which maintains the computational complexity and asymptotic consistency, and leads to improved finite-sample performance. Theoretical results are given for fourth-order stationary processes with finite fourth-order moments and absolutely convergent fourth-order cumulant function. The significant bias reduction is demonstrated with numerical simulation and an application to real-world data. Our estimator also permits irregular spacing over frequency and we demonstrate how this may be employed for signal compression and further variance reduction. Code accompanying this work is available in R and Python.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":null,"pages":null},"PeriodicalIF":2.7,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141517612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrika | Pub Date: 2024-06-22 | DOI: 10.1093/biomet/asae027
Keyao Wei, Lengyang Wang, Yingcun Xia
{"title":"Testing serial dependence or cross dependence for time series with underreporting","authors":"Keyao Wei, Lengyang Wang, Yingcun Xia","doi":"10.1093/biomet/asae027","DOIUrl":"https://doi.org/10.1093/biomet/asae027","url":null,"abstract":"In practice, it is common for collected data to be underreported, which is particularly prevalent in fields such as social sciences, ecology and epidemiology. Drawing inferences from such data using conventional statistical methods can lead to incorrect conclusions. In this paper, we study tests for serial or cross dependence in time series data that are subject to underreporting. We introduce new test statistics, develop corresponding group-of-blocks bootstrap techniques, and establish their consistency. The methods are shown to be efficient by simulation and are used to identify key factors responsible for the spread of dengue fever and the occurrence of cardiovascular disease.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":null,"pages":null},"PeriodicalIF":2.7,"publicationDate":"2024-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141509968","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrika | Pub Date: 2024-05-24 | DOI: 10.1093/biomet/asae024
Shaoxin Hong, Jiancheng Jiang, Xuejun Jiang, Haofeng Wang
{"title":"Inference for possibly misspecified generalized linear models with nonpolynomial-dimensional nuisance parameters","authors":"Shaoxin Hong, Jiancheng Jiang, Xuejun Jiang, Haofeng Wang","doi":"10.1093/biomet/asae024","DOIUrl":"https://doi.org/10.1093/biomet/asae024","url":null,"abstract":"It is routine practice in statistical modelling to first select variables and then make inference for the selected model as in stepwise regression. Such inference is made upon the assumption that the selected model is true. However, without this assumption, one would not know the validity of the inference. Similar problems also exist in high dimensional regression with regularization. To address these problems, we propose a dimension-reduced generalized likelihood ratio test for generalized linear models with nonpolynomial dimensionality, based on the quasilikelihood estimation which allows for misspecification of the conditional variance. The test has nearly oracle performance when using the correct amount of shrinkage and has robust performance against the choice of regularization parameter across a large range. We further develop an adaptive data-driven dimension-reduced generalized likelihood ratio test and prove that with probability going to one it is an oracle generalized likelihood ratio test. However, in ultrahigh-dimensional models the penalized estimation may produce spuriously important variables which degrade the performance of the test. To tackle this problem, we introduce a cross-fitted dimension-reduced generalized likelihood ratio test, which is not only free of spurious effects but also robust against the choice of regularization parameter. We establish limiting distributions of the proposed tests. Their advantages are highlighted via theoretical and empirical comparisons to some competitive tests. An application to breast cancer data illustrates the use of our proposed methodology.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":null,"pages":null},"PeriodicalIF":2.7,"publicationDate":"2024-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141100208","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrika | Pub Date: 2024-05-13 | DOI: 10.1093/biomet/asae023
Alexander Henzi, Michael Law
{"title":"A Rank-Based Sequential Test of Independence","authors":"Alexander Henzi, Michael Law","doi":"10.1093/biomet/asae023","DOIUrl":"https://doi.org/10.1093/biomet/asae023","url":null,"abstract":"Summary We consider the problem of independence testing for two univariate random variables in a sequential setting. By leveraging recent developments on safe, anytime-valid inference, we propose a test with time-uniform type I error control and derive explicit bounds on the finite sample performance of the test. We demonstrate the empirical performance of the procedure in comparison to existing sequential and non-sequential independence tests. Furthermore, since the proposed test is distribution free under the null hypothesis, we empirically simulate the gap due to Ville’s inequality–the supermartingale analogue of Markov’s inequality–that is commonly applied to control type I error in anytime-valid inference, and apply this to construct a truncated sequential test.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":null,"pages":null},"PeriodicalIF":2.7,"publicationDate":"2024-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141060585","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrika | Pub Date: 2024-05-05 | DOI: 10.1093/biomet/asae022
Cheng-Han Yang, Yu-Jen Cheng
{"title":"A model-free variable screening method for optimal treatment regimes with high-dimensional survival data","authors":"Cheng-Han Yang, Yu-Jen Cheng","doi":"10.1093/biomet/asae022","DOIUrl":"https://doi.org/10.1093/biomet/asae022","url":null,"abstract":"Summary We propose a model-free variable screening method for the optimal treatment regime with high-dimensional survival data. The proposed screening method provides a unified framework to select the active variables in a prespecified target population, including the treated group as a special case. Based on this framework, the optimal treatment regime is exactly the optimal classifier that minimizes a weighted misclassification error rate, with weights associated with survival outcome variables, the censoring distribution, and a prespecified target population. Our main contribution involves reformulating the weighted classification problem into a classification problem within a hypothetical population, where the observed data can be viewed as a sample obtained from outcome-dependent sampling, with the selection probability inversely proportional to the weights. Consequently, we introduce the weighted Kolmogorov–Smirnov approach for selecting active variables in the optimal treatment regime, extending the conventional Kolmogorov–Smirnov method for binary classification. Additionally, the proposed screening method exhibits two levels of robustness. The first level of robustness is achieved because the proposed method does not require any model assumptions for survival outcome on treatment and covariates, whereas the other is attained as the form of treatment regimes is allowed to be unspecified even without requiring convex surrogate loss, such as logit loss or hinge loss. As a result, the proposed screening method is robust to model misspecifications, and nonparametric learning methods such as random forests and boosting can be applied to those selected variables for further analysis. The theoretical properties of the proposed method are established. The performance of the proposed method is examined through simulation studies and illustrated by a real dataset.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":null,"pages":null},"PeriodicalIF":2.7,"publicationDate":"2024-05-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140883240","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrika | Pub Date: 2024-04-13 | DOI: 10.1093/biomet/asae021
Jeffrey Zhang, Dylan S Small, Siyu Heng
{"title":"Sensitivity analysis for matched observational studies with continuous exposures and binary outcomes","authors":"Jeffrey Zhang, Dylan S Small, Siyu Heng","doi":"10.1093/biomet/asae021","DOIUrl":"https://doi.org/10.1093/biomet/asae021","url":null,"abstract":"Summary Matching is one of the most widely used study designs for adjusting for measured confounders in observational studies. However, unmeasured confounding may exist and cannot be removed by matching. Therefore, a sensitivity analysis is typically needed to assess a causal conclusion’s sensitivity to unmeasured confounding. Sensitivity analysis frameworks for binary exposures have been well-established for various matching designs and are commonly used in various studies. However, unlike the binary exposure case, valid and general sensitivity analysis methods for continuous exposures are still lacking, except in some special cases such as pair matching. To fill this gap in the binary outcome case, we develop a sensitivity analysis framework for general matching designs with continuous exposures and binary outcomes. First, we use probabilistic lattice theory to show our sensitivity analysis approach is finite-population-exact under Fisher’s sharp null. Second, we prove a novel design sensitivity formula as a powerful tool for asymptotically evaluating the performance of our sensitivity analysis approach. Third, to allow effect heterogeneity with binary outcomes, we introduce a framework for conducting asymptotically exact inference and sensitivity analysis on generalized attributable effects with binary outcomes via mixed-integer programming. Fourth, for the continuous outcomes case, we show that conducting an asymptotically exact sensitivity analysis in matched observational studies when both the exposures and outcomes are continuous is generally NP-hard, except in some special cases such as pair matching. As a real data application, we apply our new methods to study the effect of early-life lead exposure on juvenile delinquency. An implementation of the methods in this work is available in the R package doseSens.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":null,"pages":null},"PeriodicalIF":2.7,"publicationDate":"2024-04-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140568514","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrika | Pub Date: 2024-04-11 | DOI: 10.1093/biomet/asae020
Erin E Gabriel, Michael C Sachs, Andreas Kryger Jensen
{"title":"Sharp symbolic nonparametric bounds for measures of benefit in observational and imperfect randomized studies with ordinal outcomes","authors":"Erin E Gabriel, Michael C Sachs, Andreas Kryger Jensen","doi":"10.1093/biomet/asae020","DOIUrl":"https://doi.org/10.1093/biomet/asae020","url":null,"abstract":"Summary The probability of benefit is a valuable and meaningful measure of treatment effect, which has advantages over the average treatment effect. Particularly for an ordinal outcome, it has a better interpretation and can make apparent different aspects of the treatment impact. Unfortunately, this measure, and variations of it, are not identifiable even in randomized trials with perfect compliance. There is, for this reason, a long literature on nonparametric bounds for unidentifiable measures of benefit. These have primarily focused on perfect randomized trial settings and one or two specific estimands. We expand these bounds to observational settings with unmeasured confounders and imperfect randomized trials for all three estimands considered in the literature: the probability of benefit, the probability of no harm, and the relative treatment effect.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":null,"pages":null},"PeriodicalIF":2.7,"publicationDate":"2024-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140568397","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrika | Pub Date: 2024-04-08 | DOI: 10.1093/biomet/asae015
J Zhang, F Xue, Q Xu, J Lee, A Qu
{"title":"Individualized dynamic model for multi-resolutional data","authors":"J Zhang, F Xue, Q Xu, J Lee, A Qu","doi":"10.1093/biomet/asae015","DOIUrl":"https://doi.org/10.1093/biomet/asae015","url":null,"abstract":"SUMMARY Mobile health has emerged as a major success for tracking individual health status, due to the popularity and power of smartphones and wearable devices. This has also brought great challenges in handling heterogeneous, multi-resolution data which arise ubiquitously in mobile health due to irregular multivariate measurements collected from individuals. In this paper, we propose an individualized dynamic latent factor model for irregular multi-resolution time series data to interpolate unsampled measurements of time series with low resolution. One major advantage of the proposed method is the capability to integrate multiple irregular time series and multiple subjects by mapping the multi-resolution data to the latent space. In addition, the proposed individualized dynamic latent factor model is applicable to capturing heterogeneous longitudinal information through individualized dynamic latent factors. Our theory provides a bound on the integrated interpolation error and the convergence rate for B-spline approximation methods. Both the simulation studies and the application to smartwatch data demonstrate the superior performance of the proposed method compared to existing methods.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":null,"pages":null},"PeriodicalIF":2.7,"publicationDate":"2024-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140568582","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrika | Pub Date: 2024-04-05 | DOI: 10.1093/biomet/asae009
{"title":"Correction to: ‘Nonparametric efficient causal mediation with intermediate confounders’","authors":"","doi":"10.1093/biomet/asae009","DOIUrl":"https://doi.org/10.1093/biomet/asae009","url":null,"abstract":"","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":null,"pages":null},"PeriodicalIF":2.7,"publicationDate":"2024-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140736660","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrika | Pub Date: 2024-03-23 | DOI: 10.1093/biomet/asae018
Jesse Hemerik, Aldo Solari, Jelle J Goeman
{"title":"Flexible control of the median of the false discovery proportion","authors":"Jesse Hemerik, Aldo Solari, Jelle J Goeman","doi":"10.1093/biomet/asae018","DOIUrl":"https://doi.org/10.1093/biomet/asae018","url":null,"abstract":"We introduce a multiple testing procedure that controls the median of the proportion of false discoveries in a flexible way. The procedure only requires a vector of p-values as input and is comparable to the Benjamini–Hochberg method, which controls the mean of the proportion of false discoveries. Our method allows free choice of one or several values of alpha after seeing the data, unlike the Benjamini–Hochberg procedure, which can be very anti-conservative when alpha is chosen post hoc. We prove these claims and illustrate them with simulations. Our procedure is inspired by a popular estimator of the total number of true hypotheses. We adapt this estimator to provide simultaneously median unbiased estimators of the proportion of false discoveries, valid for finite samples. This simultaneity allows for the claimed flexibility. Our approach does not assume independence. The time complexity of our method is linear in the number of hypotheses, after sorting the p-values.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":null,"pages":null},"PeriodicalIF":2.7,"publicationDate":"2024-03-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140199969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}