COUNTERFACTUAL INFERENCE IN SEQUENTIAL EXPERIMENTS
Raaz Dwivedi, Katherine Tian, Sabina Tomkins, Predrag Klasnja, Susan Murphy, Devavrat Shah
Annals of Statistics 53(6): 2380-2406. Published 2025-12-01 (Epub 2025-12-22). DOI: 10.1214/25-aos2519
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758907/pdf/

Abstract: We consider after-study statistical inference for sequentially designed experiments wherein multiple units are assigned treatments at multiple time points using treatment policies that adapt over time. Our goal is to provide inference guarantees for the counterfactual mean at the smallest possible scale, namely the mean outcome under different treatments for each unit and each time, with minimal assumptions on the adaptive treatment policy. Without any structural assumptions on the counterfactual means, this challenging task is infeasible because there are more unknowns than observed data points. To make progress, we introduce a latent factor model over the counterfactual means that serves as a non-parametric generalization of the non-linear mixed effects model and the bilinear latent factor model considered in prior works. For estimation, we use a non-parametric method, namely a variant of nearest neighbors, and establish a non-asymptotic high-probability error bound for the counterfactual mean for each unit and each time. Under regularity conditions, this bound leads to asymptotically valid confidence intervals for the counterfactual mean as the number of units and time points grow to ∞ together at suitable rates. We illustrate our theory via several simulations and a case study involving data from HeartSteps, a mobile health clinical trial.

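The nearest-neighbors idea behind the estimator can be sketched in a few lines: to estimate a unit's counterfactual mean at one time under one treatment, average the observed outcomes of units that received that treatment at that time and whose outcome trajectories look similar elsewhere. This is only an illustration under simplifying assumptions; the distance, the neighborhood threshold `eta`, and the function names are choices of this sketch, not the paper's exact procedure.

```python
def nn_counterfactual(Y, A, i, t, a, eta):
    """Sketch of a nearest-neighbors counterfactual estimate for unit i
    at time t under treatment a.

    Y[j][s]: observed outcome of unit j at time s.
    A[j][s]: treatment (0/1) assigned to unit j at time s.
    eta: squared-distance threshold for declaring units neighbours.
    """
    n, T = len(Y), len(Y[0])

    def dist2(j):
        # average squared outcome gap over times (other than t) where
        # units i and j received the same treatment
        common = [s for s in range(T) if s != t and A[i][s] == A[j][s]]
        if not common:
            return float("inf")
        return sum((Y[i][s] - Y[j][s]) ** 2 for s in common) / len(common)

    # neighbours: other units that got treatment a at time t and are close
    neighbours = [j for j in range(n)
                  if j != i and A[j][t] == a and dist2(j) <= eta]
    if not neighbours:
        return None
    return sum(Y[j][t] for j in neighbours) / len(neighbours)
```

With enough units and times, averaging over neighbours trades a small bias (neighbours are close, not identical) for variance reduction, which is what the paper's error bound makes precise.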
REINFORCEMENT LEARNING FOR INDIVIDUAL OPTIMAL POLICY FROM HETEROGENEOUS DATA
Rui Miao, Babak Shahbaba, Annie Qu
Annals of Statistics 53(4): 1513-1534. Published 2025-08-01. DOI: 10.1214/25-aos2512
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12439830/pdf/

Abstract: Offline reinforcement learning (RL) aims to find optimal policies in dynamic environments that maximize the expected total reward by leveraging pre-collected data. Learning from heterogeneous data is one of the fundamental challenges in offline RL. Traditional methods learn a single optimal policy for all individuals from pre-collected data from a single episode or homogeneous batch episodes, and may therefore yield a suboptimal policy for a heterogeneous population. In this paper, we propose an individualized offline policy optimization framework for heterogeneous time-stationary Markov decision processes (MDPs). The proposed heterogeneous model with individual latent variables enables us to efficiently estimate the individual Q-functions, and our Penalized Pessimistic Personalized Policy Learning (P4L) algorithm guarantees a fast rate on the average regret under a weak partial coverage assumption on behavior policies. In addition, our simulation studies and a real data application demonstrate the superior numerical performance of the proposed method compared with existing methods.

NONLINEAR GLOBAL FRÉCHET REGRESSION FOR RANDOM OBJECTS VIA WEAK CONDITIONAL EXPECTATION
Satarupa Bhattacharjee, Bing Li, Lingzhou Xue
Annals of Statistics 53(1): 117-143. Published 2025-02-01 (Epub 2025-02-13). DOI: 10.1214/24-aos2457
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12407180/pdf/

Abstract: Random objects are complex non-Euclidean data taking values in general metric spaces, possibly devoid of any underlying vector space structure. Such data are becoming increasingly abundant with the rapid advancement of technology. Examples include probability distributions, positive semidefinite matrices and data on Riemannian manifolds. However, apart from regression of object-valued responses on Euclidean predictors and distribution-on-distribution regression, there has been limited development of a general framework for object-valued responses with object-valued predictors. To fill this gap, we introduce the notion of a weak conditional Fréchet mean based on Carleman operators and then propose a global nonlinear Fréchet regression model through a reproducing kernel Hilbert space (RKHS) embedding. Furthermore, we establish the relationships between the conditional Fréchet mean and the weak conditional Fréchet mean for both Euclidean and object-valued data. We also show that the state-of-the-art global Fréchet regression developed by Petersen and Müller (Ann. Statist. 47 (2019) 691-719) emerges as a special case of our method when a linear kernel is chosen. We require that the metric space for the predictor admits a reproducing kernel, while the intrinsic geometry of the metric space for the response is used to study the asymptotic properties of the proposed estimates. Numerical studies, including extensive simulations and a real application, investigate the finite-sample performance.

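The object at the heart of this line of work, the Fréchet mean, generalizes the ordinary mean to metric spaces: it is the point minimizing the expected squared distance to the data. A minimal sample version over a finite candidate set (purely illustrative; the paper works with weak conditional Fréchet means via RKHS embeddings, not this brute-force search):

```python
def frechet_mean(candidates, points, dist):
    """Sample Frechet mean over a finite candidate set: the candidate
    minimizing the sum of squared distances to the observed points."""
    return min(candidates,
               key=lambda c: sum(dist(c, p) ** 2 for p in points))
```

With the Euclidean metric on the real line this recovers the ordinary sample mean; with other metrics (Wasserstein for distributions, geodesic distance on a manifold) it gives the corresponding notion of center, even when points cannot be added or averaged directly.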
A Statistical Framework of Watermarks for Large Language Models: Pivot, Detection Efficiency and Optimal Rules
Xiang Li, Feng Ruan, Huiyuan Wang, Qi Long, Weijie J Su
Annals of Statistics 53(1): 322-351. Published 2025-02-01. DOI: 10.1214/24-aos2468
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12467635/pdf/

Abstract: Since ChatGPT was introduced in November 2022, embedding (nearly) unnoticeable statistical signals into text generated by large language models (LLMs), also known as watermarking, has been used as a principled approach to provably distinguishing LLM-generated text from its human-written counterpart. In this paper, we introduce a general and flexible framework for reasoning about the statistical efficiency of watermarks and for designing powerful detection rules. Inspired by the hypothesis testing formulation of watermark detection, our framework starts by selecting a pivotal statistic of the text and a secret key, provided by the LLM to the verifier, to control the false positive rate (the error of mistakenly detecting human-written text as LLM-generated). The framework then allows one to evaluate the power of watermark detection rules by obtaining a closed-form expression for the asymptotic false negative rate (the error of incorrectly classifying LLM-generated text as human-written). It further reduces the problem of determining the optimal detection rule to solving a minimax optimization program. We apply this framework to two representative watermarks, one of which has been internally implemented at OpenAI, and obtain several findings that can guide the practice of implementing watermarks. In particular, we derive optimal detection rules for these watermarks under our framework. Numerical experiments demonstrate that these theoretically derived detection rules are competitive with, and sometimes more powerful than, existing detection approaches.

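To illustrate the "pivotal statistic" idea, here is a sketch in the spirit of a Gumbel-max style watermark (an assumption of this sketch; the paper's two watermarks and their optimal rules differ in details). The verifier recomputes, from the secret key, a pseudo-uniform score for each observed token. For human-written text, which is independent of the key, these scores behave as i.i.d. Unif(0,1), so the statistic below has a known Gamma(n, 1) null distribution regardless of the text: that is what makes it a pivot and fixes the false positive rate.

```python
import hashlib
import math

def pseudo_uniform(key, position, token):
    """Deterministic pseudo-uniform score in (0, 1) derived from the
    secret key, token position and token id -- a stand-in for the
    watermark's pseudorandom source."""
    h = hashlib.sha256(f"{key}:{position}:{token}".encode()).digest()
    return (int.from_bytes(h[:8], "big") + 0.5) / 2 ** 64

def detection_statistic(key, tokens):
    """Pivotal detection statistic: sum of -log(1 - U_t) over tokens.
    Under the null (text independent of the key) each U_t is Unif(0,1),
    so the sum is Gamma(n, 1) whatever the text distribution."""
    return sum(-math.log(1.0 - pseudo_uniform(key, t, w))
               for t, w in enumerate(tokens))
```

Watermarked generation biases token choices toward high scores, inflating the statistic; thresholding at an upper Gamma(n, 1) quantile then yields the desired false positive rate, and the paper's framework compares the power of this and alternative pivots.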
TESTING HIGH-DIMENSIONAL REGRESSION COEFFICIENTS IN LINEAR MODELS
Alex Zhao, Changcheng Li, Runze Li, Zhe Zhang
Annals of Statistics 52(5): 2034-2058. Published 2024-10-01 (Epub 2024-11-20). DOI: 10.1214/24-aos2420
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12981489/pdf/

Abstract: This paper is concerned with statistical inference for regression coefficients in high-dimensional linear regression models. We propose a new test for the coefficient vector of high-dimensional linear models and establish the asymptotic normality of the proposed test statistic with the aid of the martingale central limit theorem. We derive the asymptotic relative efficiency (ARE) of the proposed test with respect to the test of Zhong and Chen (J. Amer. Statist. Assoc. 106 (2011) 260-274), and show that the ARE is always greater than or equal to one under the local alternative studied in this paper. Our numerical studies indicate that the proposed test, with critical values derived from its asymptotic normal distribution, maintains the Type I error rate well. Our numerical comparison demonstrates that the proposed test outperforms existing ones in terms of power. We further illustrate the proposed method with a real data example.

EFFICIENT AND MULTIPLY ROBUST RISK ESTIMATION UNDER GENERAL FORMS OF DATASET SHIFT
Hongxiang Qiu, Eric Tchetgen Tchetgen, Edgar Dobriban
Annals of Statistics 52(4): 1796-1824. Published 2024-08-01 (Epub 2024-10-03). DOI: 10.1214/24-aos2422
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13095157/pdf/

Abstract: Statistical machine learning methods often face the challenge of limited data from the population of interest. One remedy is to leverage data from auxiliary source populations that share some conditional distributions with, or are otherwise linked to, the target domain. Techniques leveraging such dataset shift conditions are known as domain adaptation or transfer learning. Despite the extensive literature on dataset shift, few works address how to efficiently use auxiliary populations to improve the accuracy of risk evaluation for a given machine learning task in the target population. In this paper, we study the general problem of efficiently estimating target population risk under various dataset shift conditions, leveraging semiparametric efficiency theory. We consider a general class of dataset shift conditions, which includes three popular conditions (covariate, label and concept shift) as special cases, and we allow for partially non-overlapping support between the source and target populations. We develop efficient and multiply robust estimators, along with a straightforward specification test for these dataset shift conditions. We also derive efficiency bounds for two other dataset shift conditions, posterior drift and location-scale shift. Simulation studies support the efficiency gains from leveraging plausible dataset shift conditions.

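Under the simplest of these conditions, covariate shift, the target-population risk can be estimated from source data by importance weighting with the covariate density ratio. A minimal self-normalized sketch (illustrative only; the paper's estimators are efficient and multiply robust, which this plug-in estimator is not):

```python
def covariate_shift_risk(losses, weights):
    """Self-normalised importance-weighted risk estimate.

    losses[i]:  loss of the model on the i-th source sample.
    weights[i]: density ratio p_target(x_i) / p_source(x_i) of the
                covariates (assumed known here for illustration).

    Under covariate shift, E_target[loss] = E_source[w * loss], so the
    weighted average below targets the risk in the target population.
    """
    return sum(w, l_ := None) if False else \
        sum(w * l for w, l in zip(weights, losses)) / sum(weights)
```

Self-normalizing (dividing by the weight sum rather than the sample size) trades a little bias for stability when the estimated density ratios are noisy.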
ON BLOCKWISE AND REFERENCE PANEL-BASED ESTIMATORS FOR GENETIC DATA PREDICTION IN HIGH DIMENSIONS
Bingxin Zhao, Shurong Zheng, Hongtu Zhu
Annals of Statistics 52(3): 948-965. Published 2024-06-01 (Epub 2024-08-11). DOI: 10.1214/24-aos2378
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11391480/pdf/

Abstract: Genetic prediction holds immense promise for translating genetic discoveries into medical advances. As the high-dimensional covariance matrix (or the linkage disequilibrium (LD) pattern) of genetic variants often has a block-diagonal structure, numerous methods account for the dependence among variants within predetermined local LD blocks. Moreover, due to privacy considerations and data protection concerns, the dependence of genetic variants in each LD block is typically estimated from external reference panels rather than from the original training data set. This paper presents a unified analysis of blockwise and reference panel-based estimators in a high-dimensional prediction framework without sparsity restrictions. We find that, surprisingly, even when the covariance matrix has a block-diagonal structure with well-defined boundaries, blockwise estimation methods that adjust for local dependence can be substantially less accurate than methods controlling for the whole covariance matrix. Furthermore, estimation methods built on the original training data set and on external reference panels are likely to perform differently in high dimensions, which may reflect the cost of having access only to summary-level data from the training data set. This analysis is based on novel results in random matrix theory for block-diagonal covariance matrices. We numerically evaluate our results using extensive simulations and real data analysis in the UK Biobank.

Minimax rates for heterogeneous causal effect estimation
Edward H Kennedy, Sivaraman Balakrishnan, James M Robins, Larry Wasserman
Annals of Statistics 52(2): 793-816. Published 2024-04-01 (Epub 2024-05-09). DOI: 10.1214/24-aos2369
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11960818/pdf/

Abstract: Estimation of heterogeneous causal effects (i.e., how the effects of policies and treatments vary across subjects) is a fundamental task in causal inference. Many methods for estimating conditional average treatment effects (CATEs) have been proposed in recent years, but questions surrounding optimality have remained largely unanswered. In particular, a minimax theory of optimality has yet to be developed, with the minimax rate of convergence and the construction of rate-optimal estimators remaining open problems. In this paper we derive the minimax rate for CATE estimation in a Hölder-smooth nonparametric model and present a new local polynomial estimator, giving high-level conditions under which it is minimax optimal. Our minimax lower bound is derived via a localized version of the method of fuzzy hypotheses, combining lower bound constructions for nonparametric regression and functional estimation. The proposed estimator can be viewed as a local polynomial R-learner based on a localized modification of higher-order influence function methods. The minimax rate we find exhibits several interesting features, including a non-standard elbow phenomenon and an unusual interpolation between nonparametric regression and functional estimation rates. The latter quantifies how the CATE, as an estimand, can be viewed as a regression/functional hybrid.

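The R-learner idea underlying the estimator: residualize the outcome and the treatment on the covariates, then regress residual on residual. A deliberately crude sketch with constant nuisance estimates and a constant treatment effect (the paper's estimator is a local polynomial R-learner with nonparametric nuisances; everything below is a simplification for illustration):

```python
def r_learner_constant(Y, A):
    """Minimal R-learner sketch: constant nuisances, constant effect.

    Residualise the outcome Y on a crude stand-in for E[Y | X] (the
    overall mean) and the binary treatment A on a crude stand-in for
    the propensity E[A | X] (the treatment fraction), then regress
    residual on residual:

        tau_hat = sum(ry * ra) / sum(ra ** 2)
    """
    n = len(Y)
    m_hat = sum(Y) / n          # stand-in for the outcome regression
    e_hat = sum(A) / n          # stand-in for the propensity score
    ry = [y - m_hat for y in Y]
    ra = [a - e_hat for a in A]
    return sum(u * v for u, v in zip(ry, ra)) / sum(v * v for v in ra)
```

Replacing the constant effect with a local polynomial basis around each evaluation point, and the crude nuisances with flexible estimators, is the direction the paper's local polynomial R-learner takes.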
RANK-BASED INDICES FOR TESTING INDEPENDENCE BETWEEN TWO HIGH-DIMENSIONAL VECTORS
Yeqing Zhou, Kai Xu, Liping Zhu, Runze Li
Annals of Statistics 52(1): 184-206. Published 2024-02-01 (Epub 2024-03-07). DOI: 10.1214/23-aos2339
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11064990/pdf/

Abstract: To test independence between two high-dimensional random vectors, we propose three tests based on rank-based indices derived from Hoeffding's D, Blum-Kiefer-Rosenblatt's R and Bergsma-Dassios-Yanagimoto's τ*. Under the null hypothesis of independence, we show that the distributions of the proposed test statistics converge to normal distributions as the dimensions diverge arbitrarily with the sample size, and we derive an explicit rate of convergence. Thanks to their invariance under monotone transformations, these distribution-free tests can readily be applied to arbitrarily distributed random vectors, including heavy-tailed ones. We further study the local power of the proposed tests and compare their relative efficiencies with two classic distance covariance/correlation-based tests in high-dimensional settings. We establish explicit relationships between D, R, τ* and Pearson's correlation for bivariate normal random variables; these relationships serve as a basis for the power comparison. Our theoretical results show that under a Gaussian equicorrelation alternative, (i) the proposed tests are superior to the two classic distance covariance/correlation-based tests if the components of the random vectors have very different scales; and (ii) the asymptotic efficiencies of the proposed tests based on D, τ* and R are in descending order.

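The tests aggregate a classical rank-based dependence index over all coordinate pairs of the two vectors. A sketch of that aggregation using Spearman's rho as a stand-in (an assumption of this sketch to keep it short; the paper's statistics use Hoeffding's D, Blum-Kiefer-Rosenblatt's R and τ*, which detect broader alternatives, and they are centred and standardized for the normal limit):

```python
def ranks(x):
    """Ranks 1..n of a sample without ties."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0] * len(x)
    for rank, i in enumerate(order):
        r[i] = rank + 1
    return r

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx = my = (n + 1) / 2
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx)
           * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den

def pairwise_rank_statistic(X, Y):
    """Aggregate a rank-based dependence index over all coordinate
    pairs (j, k); X and Y are lists of coordinate samples."""
    return sum(spearman(xj, yk) ** 2 for xj in X for yk in Y)
```

Because ranks are unchanged by monotone transformations of each coordinate, such statistics are distribution-free under the null, which is exactly the property the paper exploits for heavy-tailed data.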
SUPERVISED HOMOGENEITY FUSION: A COMBINATORIAL APPROACH
Wen Wang, Shihao Wu, Ziwei Zhu, Ling Zhou, Peter X-K Song
Annals of Statistics 52(1): 285-310. Published 2024-02-01 (Epub 2024-03-07). DOI: 10.1214/23-aos2347
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12327361/pdf/

Abstract: Fusing regression coefficients into homogeneous groups can unveil coefficients that share a common value within each group. Such groupwise homogeneity reduces the intrinsic dimension of the parameter space and yields sharper statistical accuracy. We propose and investigate a new combinatorial grouping approach, called L0-Fusion, that is amenable to mixed integer optimization (MIO). On the statistical side, we identify a fundamental quantity called the MSE grouping sensitivity that underpins the difficulty of recovering the true groups. We show that L0-Fusion achieves grouping consistency under the weakest possible requirement on the grouping sensitivity: if this requirement is violated, the minimax risk of group misspecification fails to converge to zero. Moreover, we show that in the high-dimensional regime, one can apply L0-Fusion with a sure screening set of features without any essential loss of statistical efficiency, while substantially reducing the computational cost. On the algorithmic side, we provide an MIO formulation for L0-Fusion along with a warm-start strategy. Simulation and real data analysis demonstrate that L0-Fusion outperforms its competitors in terms of grouping accuracy.

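To see what "fusing" coefficients means in practice, here is a naive greedy baseline: sort the fitted coefficients and merge neighbours that differ by at most a tolerance. This is only a heuristic contrast to the paper's method, which solves the exact L0 combinatorial program via mixed integer optimization; the threshold `tol` and the greedy rule are assumptions of this sketch.

```python
def fuse_by_threshold(beta, tol):
    """Greedily fuse coefficient estimates into groups: walk the
    coefficients in sorted order and start a new group whenever the
    gap to the previous coefficient exceeds tol.  Returns groups of
    coefficient indices."""
    order = sorted(range(len(beta)), key=lambda j: beta[j])
    groups, current = [], [order[0]]
    for j in order[1:]:
        if beta[j] - beta[current[-1]] <= tol:
            current.append(j)
        else:
            groups.append(current)
            current = [j]
    groups.append(current)
    return groups
```

Each returned group could then be refit with a single shared coefficient, reducing the parameter dimension from p to the number of groups, which is the statistical payoff of homogeneity that the paper quantifies.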