{"title":"Inference in models with omitted covariates: Cramér-type moderate deviations and applications to high-dimensional regression","authors":"Rebecca M. Lewis, Heather S. Battey, Wen-Xin Zhou","doi":"10.1007/s10463-025-00935-y","DOIUrl":"10.1007/s10463-025-00935-y","url":null,"abstract":"<div><p>We study a score statistic for inference on an interest parameter in a linear model with omitted covariates, establishing Berry–Esseen and Cramér-type moderate deviation bounds on the associated normal approximation. This entails a coupling between well-behaved but unobservable random variables and observable ones to which standard results do not straightforwardly apply. The theory is of self-standing interest but also provides new insights on backwards reduction procedures used in high-dimensional regression. An example details how our results may be used to analyse the high-dimensional procedure proposed by Cox and Battey (<i>Proceedings of the National Academy of Sciences,</i> <b>114</b>, 8592–8595, 2017).</p></div>","PeriodicalId":55511,"journal":{"name":"Annals of the Institute of Statistical Mathematics","volume":"78 2","pages":"177 - 224"},"PeriodicalIF":0.6,"publicationDate":"2025-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147352850","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust empirical likelihood variable selection for the high dimensional single-index regression model","authors":"Huybrechts F. Bindele, Olivia Atutey","doi":"10.1007/s10463-025-00938-9","DOIUrl":"10.1007/s10463-025-00938-9","url":null,"abstract":"<div><p>A single-index regression model is considered, for which robust and efficient inference about the model parameters is proposed. Using a local linear approximation, the unknown regression function is estimated by the generalized signed-rank approach. Next, combining the estimated function with the estimating equation obtained from the generalized signed-rank objective function, a penalized empirical likelihood objective function of the index parameter is defined, and its asymptotic distribution is established under mild regularity conditions. The performance of the proposed method is demonstrated via extensive Monte Carlo simulation experiments. The simulation results are compared with those from a normal approximation alternative and those based on the least squares and least absolute deviations approaches. Finally, a real data example is given to illustrate the proposed methodology.</p></div>","PeriodicalId":55511,"journal":{"name":"Annals of the Institute of Statistical Mathematics","volume":"78 1","pages":"93 - 113"},"PeriodicalIF":0.6,"publicationDate":"2025-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146071354","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exact two-sided confidence sets for a level set in simple linear regression","authors":"Fang Wan, Wei Liu, Frank Bretz","doi":"10.1007/s10463-025-00946-9","DOIUrl":"10.1007/s10463-025-00946-9","url":null,"abstract":"<div><p>Regression modeling is the workhorse of statistics. It has been realized in recent years that one important aim in regression analysis may be the estimation of a level set of the regression function. The published work on this has thus far focused mainly on nonparametric regression, especially on point estimation. In our previous work, we constructed exact upper and lower, but only conservative two-sided, confidence sets for a level set in linear regression. In this paper, exact two-sided confidence sets are constructed in simple linear regression. A simultaneity property of the exact two-sided confidence sets is also studied. An example is given to illustrate the method.</p></div>","PeriodicalId":55511,"journal":{"name":"Annals of the Institute of Statistical Mathematics","volume":"78 2","pages":"263 - 274"},"PeriodicalIF":0.6,"publicationDate":"2025-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147352875","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sparse quantile regression via \\(\\ell_0\\)-penalty","authors":"Toshio Honda, Wei-Ying Wu","doi":"10.1007/s10463-025-00941-0","DOIUrl":"10.1007/s10463-025-00941-0","url":null,"abstract":"<div><p>We consider model selection via <span>\\(\\ell_0\\)</span>-penalty for high-dimensional sparse quantile regression models. This procedure is almost equivalent to model selection via information criterion due to similarity in penalty. We deal with linear models, additive models, and varying coefficient models in a unified way and establish the model selection consistency results rigorously when the size of the relevant index set goes to infinity. The treatment of this situation is challenging and the theoretical novelty of our results is important because such information criteria are commonly used. We consider two different setups and propose tuning parameters in the <span>\\(\\ell_0\\)</span>-penalty. Besides, we propose a feasible algorithm for computation of our estimator and the numerical study results are presented.</p></div>","PeriodicalId":55511,"journal":{"name":"Annals of the Institute of Statistical Mathematics","volume":"78 1","pages":"141 - 173"},"PeriodicalIF":0.6,"publicationDate":"2025-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146071353","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A greedy and optimistic clustering for leveraging individual covariate uncertainty","authors":"Akifumi Okuno, Kohei Hattori","doi":"10.1007/s10463-025-00947-8","DOIUrl":"10.1007/s10463-025-00947-8","url":null,"abstract":"<div><p>In this study, we examine a clustering problem where each individual element in a dataset has covariates associated with element-specific uncertainty. More specifically, we consider a clustering approach that preliminarily applies a non-linear transformation to the covariates, to capture the hidden data structure; we empirically approximate the sets representing the propagated uncertainty for the pre-processed features and propose a greedy and optimistic clustering (GOC) algorithm. This algorithm identifies better feature candidates within these sets, resulting in more condensed clusters. As a key application, we apply the GOC algorithm to synthetic datasets of the orbital properties of stars, generated through our numerical simulations that mimic the formation process of the Milky Way. The GOC algorithm demonstrates improved performance in identifying sibling stars originating from the same dwarf galaxy. These realistic datasets are also publicly available at https://github.com/oknakfm/GOC.</p></div>","PeriodicalId":55511,"journal":{"name":"Annals of the Institute of Statistical Mathematics","volume":"78 2","pages":"275 - 296"},"PeriodicalIF":0.6,"publicationDate":"2025-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147352756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Application of some \\(L_{2}\\) optimization to a discrete distribution","authors":"Jiwoong Kim","doi":"10.1007/s10463-025-00933-0","DOIUrl":"10.1007/s10463-025-00933-0","url":null,"abstract":"<div><p>This paper proposes a novel method to estimate the success probability of the binomial distribution. The proposed method employs the Cramér-von Mises type optimization which has been commonly used in estimating parameters of continuous distributions. Upon obtaining the estimator through the proposed method, its desirable properties, such as asymptotic distribution and robustness, are rigorously investigated. Simulation studies serve to demonstrate that the proposed method compares favorably with other well-celebrated methods, including the maximum likelihood method.</p></div>","PeriodicalId":55511,"journal":{"name":"Annals of the Institute of Statistical Mathematics","volume":"78 1","pages":"43 - 67"},"PeriodicalIF":0.6,"publicationDate":"2025-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146071377","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust variable selection in high-dimensional nonparametric additive model","authors":"Suneel Babu Chatla, Abhijit Mandal","doi":"10.1007/s10463-025-00939-8","DOIUrl":"10.1007/s10463-025-00939-8","url":null,"abstract":"<div><p>Additive models are flexible nonparametric models. Finding the nonzero additive components when the true model is assumed to be sparse is an important problem and is well studied. The existing research focused on using the <span>\\(L_2\\)</span> loss function, which is sensitive to outliers in the data. We propose a new variable selection method for additive models that is robust to outliers in the data. It considers the framework of B-splines and density power divergence loss function for estimation, and employs a nonconcave penalty for variable selection. Our asymptotic results are derived under the sub-Weibull assumption, which allows the error distribution to have an exponentially heavy tail. Under regularity conditions, we show that the proposed method achieves the optimal convergence rate. Our results include the convergence rates for sub-Gaussian and sub-Exponential distributions as special cases. We numerically validate the theoretical findings using simulations and real data analysis.</p></div>","PeriodicalId":55511,"journal":{"name":"Annals of the Institute of Statistical Mathematics","volume":"78 1","pages":"115 - 140"},"PeriodicalIF":0.6,"publicationDate":"2025-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146071378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Posterior contraction rate and asymptotic Bayes optimality for one group global–local shrinkage priors in sparse normal means problem","authors":"Sayantan Paul, Arijit Chakrabarti","doi":"10.1007/s10463-025-00932-1","DOIUrl":"10.1007/s10463-025-00932-1","url":null,"abstract":"<div><p>We study inference on the mean vector of the normal means model in sparse asymptotic settings when it is modelled by broad classes of one-group global–local continuous shrinkage priors. We prove that the resulting posterior distributions contract around the truth at a near minimax rate with respect to squared <span>\\(L_2\\)</span> loss when the global shrinkage parameter is estimated in empirical Bayesian ways or arbitrary priors supported on some appropriate interval are assigned to it. We then employ an intuitive multiple testing rule (using full Bayes treatment with global–local priors) in a problem of simultaneous testing (with additive misclassification loss) for the components of the mean assuming they are iid from a two-groups prior. In a first result of its kind, risk of our testing rule is shown to asymptotically match (up to a constant) that of the optimal rule in the two-groups setting.</p></div>","PeriodicalId":55511,"journal":{"name":"Annals of the Institute of Statistical Mathematics","volume":"77 5","pages":"787 - 819"},"PeriodicalIF":0.6,"publicationDate":"2025-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144923235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Localization of moving poisson source on the plane","authors":"O. V. Chernoyarov, S. Dachian, Y. A. Kutoyants","doi":"10.1007/s10463-025-00937-w","DOIUrl":"10.1007/s10463-025-00937-w","url":null,"abstract":"<div><p>Two problems of Poissonian source localization by <i>K</i> detectors on the plane are considered. It is supposed that the intensities of the received processes are decreasing functions of the distances between the source and the detectors. In the first problem the source is fixed at some point on the plane, and in the second problem it is supposed that the source is moving along some line. In both problems the properties of the least squares estimators, MLE, Bayesian estimators, One-step MLE and One-step MLE-processes are described. Regularity conditions are proposed which allow us to prove the consistency and the asymptotic normality of all the estimators. Note that One-step MLE-processes allow <i>on-line</i> tracking of the moving source.</p></div>","PeriodicalId":55511,"journal":{"name":"Annals of the Institute of Statistical Mathematics","volume":"78 1","pages":"69 - 92"},"PeriodicalIF":0.6,"publicationDate":"2025-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146071380","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Selection-bias-adjusted inference for the bivariate normal distribution under soft-threshold sampling","authors":"Joseph B. Lang","doi":"10.1007/s10463-025-00925-0","DOIUrl":"10.1007/s10463-025-00925-0","url":null,"abstract":"<div><p>The problem of estimating parameters and predicting outcomes of a bivariate Normal distribution is more challenging when, owing to data-dependent selection (or missingness or dropout), the available data are not a representative sample of bivariate realizations. This problem is addressed using an observation model that is induced by a combination of a multivariate Normal “science” model and a realistic “soft-threshold selection” model with unknown truncation point. This observation model, which is expressed using an intuitive selection subset notation, is a generalization of existing “hard-threshold” models. It affords simple-to-compute selection-bias-adjusted estimates of both the regression (conditional mean) parameters and the bivariate correlation. In addition, a simple bootstrap approach for computing both confidence and prediction intervals in the soft-threshold selection setting is described. Simulation results are promising. To motivate this research, two illustrative examples describe a setting where selection bias is an issue of concern.</p></div>","PeriodicalId":55511,"journal":{"name":"Annals of the Institute of Statistical Mathematics","volume":"77 4","pages":"597 - 625"},"PeriodicalIF":0.6,"publicationDate":"2025-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145141921","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}