Biometrika, Pub Date: 2023-01-30, DOI: 10.1093/biomet/asad006
Haihan Yu, Mark S Kaiser, Daniel J Nordman
{"title":"A subsampling perspective for extending the validity of state-of-the-art bootstraps in the frequency domain","authors":"Haihan Yu, Mark S Kaiser, Daniel J Nordman","doi":"10.1093/biomet/asad006","DOIUrl":"https://doi.org/10.1093/biomet/asad006","url":null,"abstract":"Bootstrapping spectral mean statistics has been a notoriously difficult problem over the past 25 years. Many frequency domain bootstraps are valid only for certain time series structures, e.g., linear processes, or for special types of statistics, i.e., ratio statistics, because such bootstraps fail to capture the limiting variance of spectral statistics in general settings. We address this issue with a different form of resampling, namely, subsampling. While not considered previously, subsampling provides consistent variance estimation under much weaker conditions than any existing bootstrap in the frequency domain. Mixing is not used, as is often standard with subsampling. Rather, subsampling can be generally justified under the same conditions needed for original spectral mean statistics to have distributional limits in the first place. This result has impacts for other bootstrap methods. Subsampling then applies to extending the validity of recent state-of-the-art bootstraps in the frequency domain. We nontrivially link subsampling to such bootstraps, which broadens their range, as moment and block assumptions needed for these are cut by more than half. Essentially, state-of-the-art bootstraps then require no more stringent assumptions than those needed for a target limit distribution to exist, which is unusual in the bootstrap world. We also close a gap in the theory of subsampling for time series with distributional approximations, in addition to variance estimation, for frequency domain statistics.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":"88 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135554424","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrika, Pub Date: 2023-01-10, DOI: 10.1093/biomet/asad001
Zhipeng Lou, Xianyang Zhang, Weichi Wu
{"title":"High Dimensional Analysis of Variance in Multivariate Linear Regression","authors":"Zhipeng Lou, Xianyang Zhang, Weichi Wu","doi":"10.1093/biomet/asad001","DOIUrl":"https://doi.org/10.1093/biomet/asad001","url":null,"abstract":"In this paper, we develop a systematic theory for high dimensional analysis of variance in multivariate linear regression, where the dimension and the number of coefficients can both grow with the sample size. We propose a new U type test statistic to test linear hypotheses and establish a high dimensional Gaussian approximation result under fairly mild moment assumptions. Our general framework and theory can be applied to deal with the classical one-way multivariate analysis of variance and the nonparametric one-way multivariate analysis of variance in high dimensions. To implement the test procedure, we introduce a sample-splitting based estimator of the second moment of the error covariance and discuss its properties. A simulation study shows that our proposed test outperforms some existing tests in various settings.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":" ","pages":""},"PeriodicalIF":2.7,"publicationDate":"2023-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47351243","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrika, Pub Date: 2023-01-09, DOI: 10.1093/biomet/asad005
Zhichao Jiang, Shizhe Chen, Peng Ding
{"title":"An instrumental variable method for point processes: generalized Wald estimation based on deconvolution","authors":"Zhichao Jiang, Shizhe Chen, Peng Ding","doi":"10.1093/biomet/asad005","DOIUrl":"https://doi.org/10.1093/biomet/asad005","url":null,"abstract":"Point processes are probabilistic tools for modelling event data. While there exists a fast-growing literature studying the relationships between point processes, it remains unexplored how such relationships connect to causal effects. In the presence of unmeasured confounders, parameters from point process models do not necessarily have causal interpretations. We propose an instrumental variable method for causal inference with point process treatment and outcome. We define causal quantities based on potential outcomes and establish nonparametric identification results with a binary instrumental variable. We extend the traditional Wald estimation to deal with point process treatment and outcome, showing that it should be performed after a Fourier transform of the intention-to-treat effects on the treatment and outcome and thus takes the form of deconvolution. We term this generalized Wald estimation and propose an estimation strategy based on well-established deconvolution methods.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":" ","pages":""},"PeriodicalIF":2.7,"publicationDate":"2023-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45747425","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrika, Pub Date: 2022-12-01, Epub Date: 2022-11-18, DOI: 10.1093/biomet/asab059
Ian W McKeague, Xin Zhang
{"title":"Significance testing for canonical correlation analysis in high dimensions.","authors":"Ian W McKeague, Xin Zhang","doi":"10.1093/biomet/asab059","DOIUrl":"10.1093/biomet/asab059","url":null,"abstract":"We consider the problem of testing for the presence of linear relationships between large sets of random variables based on a post-selection inference approach to canonical correlation analysis. The challenge is to adjust for the selection of subsets of variables having linear combinations with maximal sample correlation. To this end, we construct a stabilized one-step estimator of the euclidean-norm of the canonical correlations maximized over subsets of variables of pre-specified cardinality. This estimator is shown to be consistent for its target parameter and asymptotically normal, provided the dimensions of the variables do not grow too quickly with sample size. We also develop a greedy search algorithm to accurately compute the estimator, leading to a computationally tractable omnibus test for the global null hypothesis that there are no linear relationships between any subsets of variables having the pre-specified cardinality. We further develop a confidence interval that takes the variable selection into account.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":"109 4","pages":"1067-1083"},"PeriodicalIF":2.4,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9857302/pdf/nihms-1771870.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10613294","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrika, Pub Date: 2022-12-01, DOI: 10.1093/biomet/asac007
C Huang, H Zhu
{"title":"Functional hybrid factor regression model for handling heterogeneity in imaging studies.","authors":"C Huang, H Zhu","doi":"10.1093/biomet/asac007","DOIUrl":"10.1093/biomet/asac007","url":null,"abstract":"This paper develops a functional hybrid factor regression modelling framework to handle the heterogeneity of many large-scale imaging studies, such as the Alzheimer's disease neuroimaging initiative study. Despite the numerous successes of those imaging studies, such heterogeneity may be caused by the differences in study environment, population, design, protocols or other hidden factors, and it has posed major challenges in integrative analysis of imaging data collected from multicentres or multistudies. We propose both estimation and inference procedures for estimating unknown parameters and detecting unknown factors under our new model. The asymptotic properties of both estimation and inference procedures are systematically investigated. The finite-sample performance of our proposed procedures is assessed by using Monte Carlo simulations and a real data example on hippocampal surface data from the Alzheimer's disease study.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":"109 4","pages":"1133-1148"},"PeriodicalIF":2.7,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9754099/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10749215","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrika, Pub Date: 2022-12-01, Epub Date: 2022-02-16, DOI: 10.1093/biomet/asac011
Jason Xu, Kenneth Lange
{"title":"A proximal distance algorithm for likelihood-based sparse covariance estimation.","authors":"Jason Xu, Kenneth Lange","doi":"10.1093/biomet/asac011","DOIUrl":"10.1093/biomet/asac011","url":null,"abstract":"This paper addresses the task of estimating a covariance matrix under a patternless sparsity assumption. In contrast to existing approaches based on thresholding or shrinkage penalties, we propose a likelihood-based method that regularizes the distance from the covariance estimate to a symmetric sparsity set. This formulation avoids unwanted shrinkage induced by more common norm penalties, and enables optimization of the resulting nonconvex objective by solving a sequence of smooth, unconstrained subproblems. These subproblems are generated and solved via the proximal distance version of the majorization-minimization principle. The resulting algorithm executes rapidly, gracefully handles settings where the number of parameters exceeds the number of cases, yields a positive-definite solution, and enjoys desirable convergence properties. Empirically, we demonstrate that our approach outperforms competing methods across several metrics, for a suite of simulated experiments. Its merits are illustrated on international migration data and a case study on flow cytometry. Our findings suggest that the marginal and conditional dependency networks for the cell signalling data are more similar than previously concluded.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":"1 1","pages":"1047-1066"},"PeriodicalIF":2.7,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10716840/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"60702732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrika, Pub Date: 2022-12-01, DOI: 10.1093/biomet/asab061
Debangan Dey, Abhirup Datta, Sudipto Banerjee
{"title":"Graphical Gaussian Process Models for Highly Multivariate Spatial Data.","authors":"Debangan Dey, Abhirup Datta, Sudipto Banerjee","doi":"10.1093/biomet/asab061","DOIUrl":"https://doi.org/10.1093/biomet/asab061","url":null,"abstract":"For multivariate spatial Gaussian process (GP) models, customary specifications of cross-covariance functions do not exploit relational inter-variable graphs to ensure process-level conditional independence among the variables. This is undesirable, especially for highly multivariate settings, where popular cross-covariance functions such as the multivariate Matérn suffer from a \"curse of dimensionality\" as the number of parameters and floating point operations scale up in quadratic and cubic order, respectively, in the number of variables. We propose a class of multivariate \"Graphical Gaussian Processes\" using a general construction called \"stitching\" that crafts cross-covariance functions from graphs and ensures process-level conditional independence among variables. For the Matérn family of functions, stitching yields a multivariate GP whose univariate components are Matérn GPs, and conforms to process-level conditional independence as specified by the graphical model. For highly multivariate settings and decomposable graphical models, stitching offers massive computational gains and parameter dimension reduction. We demonstrate the utility of the graphical Matérn GP to jointly model highly multivariate spatial data using simulation examples and an application to air-pollution modelling.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":"109 4","pages":"993-1014"},"PeriodicalIF":2.7,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9838617/pdf/nihms-1786615.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9104899","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrika, Pub Date: 2022-11-10, DOI: 10.1093/biomet/asac060
Minjie Wang, Genevera I. Allen
{"title":"Thresholded Graphical Lasso Adjusts for Latent Variables","authors":"Minjie Wang, Genevera I. Allen","doi":"10.1093/biomet/asac060","DOIUrl":"https://doi.org/10.1093/biomet/asac060","url":null,"abstract":"Structural learning of Gaussian graphical models in the presence of latent variables has long been a challenging problem. Chandrasekaran et al. (2012) proposed a convex program to estimate a sparse graph plus low-rank term that adjusts for latent variables; but, this approach poses challenges from both a computational and statistical perspective. We propose an alternative and incredibly simple solution: apply a hard thresholding operator to existing graph selection methods. Conceptually simple and computationally attractive, we show that thresholding the graphical lasso is graph selection consistent in the presence of latent variables under a simpler minimum edge strength condition and at an improved statistical rate. We also extend results to thresholded neighbourhood selection and CLIME estimators as well. We show that our simple thresholded graph estimators enjoy stronger empirical results than existing approaches for the latent variable graphical model problem and conclude with a neuroscience case study to estimate functional neural connections.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":" ","pages":""},"PeriodicalIF":2.7,"publicationDate":"2022-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48628621","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrika, Pub Date: 2022-09-29, DOI: 10.1093/biomet/asac055
Z. Lin, H. Müller, B. U. Park
{"title":"Additive Models for Symmetric Positive-Definite Matrices and Lie Groups","authors":"Z. Lin, H. Müller, B. U. Park","doi":"10.1093/biomet/asac055","DOIUrl":"https://doi.org/10.1093/biomet/asac055","url":null,"abstract":"We propose and investigate an additive regression model for symmetric positive-definite matrix valued responses and multiple scalar predictors. The model exploits the abelian group structure inherited from either of the log-Cholesky and log-Euclidean frameworks for symmetric positive-definite matrices and naturally extends to general abelian Lie groups. The proposed additive model is shown to connect to an additive model on a tangent space. This connection not only entails an efficient algorithm to estimate the component functions but also allows one to generalize the proposed additive model to general Riemannian manifolds. Optimal asymptotic convergence rates and normality of the estimated component functions are established and numerical studies show that the proposed model enjoys good numerical performance and is not subject to the curse of dimensionality when there are multiple predictors. The practical merits of the proposed model are demonstrated through an analysis of brain diffusion tensor imaging data.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":" ","pages":""},"PeriodicalIF":2.7,"publicationDate":"2022-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47707225","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}