{"title":"Phase transition and higher order analysis of <i>L<sub>q</sub></i> regularization under dependence.","authors":"Hanwen Huang, Peng Zeng, Qinglong Yang","doi":"10.1093/imaiai/iaae005","DOIUrl":"10.1093/imaiai/iaae005","url":null,"abstract":"<p><p>We study the problem of estimating a [Formula: see text]-sparse signal [Formula: see text] from a set of noisy observations [Formula: see text] under the model [Formula: see text], where [Formula: see text] is the measurement matrix the row of which is drawn from distribution [Formula: see text]. We consider the class of [Formula: see text]-regularized least squares (LQLS) given by the formulation [Formula: see text], where [Formula: see text] [Formula: see text] denotes the [Formula: see text]-norm. In the setting [Formula: see text] with fixed [Formula: see text] and [Formula: see text], we derive the asymptotic risk of [Formula: see text] for arbitrary covariance matrix [Formula: see text] that generalizes the existing results for standard Gaussian design, i.e. [Formula: see text]. The results were derived from the non-rigorous replica method. We perform a higher-order analysis for LQLS in the small-error regime in which the first dominant term can be used to determine the phase transition behavior of LQLS. Our results show that the first dominant term does not depend on the covariance structure of [Formula: see text] in the cases [Formula: see text] and [Formula: see text] which indicates that the correlations among predictors only affect the phase transition curve in the case [Formula: see text] a.k.a. LASSO. To study the influence of the covariance structure of [Formula: see text] on the performance of LQLS in the cases [Formula: see text] and [Formula: see text], we derive the explicit formulas for the second dominant term in the expansion of the asymptotic risk in terms of small error. Extensive computational experiments confirm that our analytical predictions are consistent with numerical results.</p>","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2024-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10878746/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139933465","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On statistical inference with high-dimensional sparse CCA.","authors":"Nilanjana Laha, Nathan Huey, Brent Coull, Rajarshi Mukherjee","doi":"10.1093/imaiai/iaad040","DOIUrl":"10.1093/imaiai/iaad040","url":null,"abstract":"<p><p>We consider asymptotically exact inference on the leading canonical correlation directions and strengths between two high-dimensional vectors under sparsity restrictions. In this regard, our main contribution is developing a novel representation of the Canonical Correlation Analysis problem, based on which one can operationalize a one-step bias correction on reasonable initial estimators. Our analytic results in this regard are adaptive over suitable structural restrictions of the high-dimensional nuisance parameters, which, in this set-up, correspond to the covariance matrices of the variables of interest. We further supplement the theoretical guarantees behind our procedures with extensive numerical studies.</p>","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2023-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138048165","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Black-box tests for algorithmic stability.","authors":"Byol Kim, Rina Foygel Barber","doi":"10.1093/imaiai/iaad039","DOIUrl":"10.1093/imaiai/iaad039","url":null,"abstract":"<p><p>Algorithmic stability is a concept from learning theory that expresses the degree to which changes to the input data (e.g. removal of a single data point) may affect the outputs of a regression algorithm. Knowing an algorithm's stability properties is often useful for many downstream applications-for example, stability is known to lead to desirable generalization properties and predictive inference guarantees. However, many modern algorithms currently used in practice are too complex for a theoretical analysis of their stability properties, and thus we can only attempt to establish these properties through an empirical exploration of the algorithm's behaviour on various datasets. In this work, we lay out a formal statistical framework for this kind of <i>black-box testing</i> without any assumptions on the algorithm or the data distribution, and establish fundamental bounds on the ability of any black-box test to identify algorithmic stability.</p>","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":1.4,"publicationDate":"2023-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10576650/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41239705","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian denoising of structured sources and its implications on learning-based denoising","authors":"Wenda Zhou, Joachim Wabnig, Shirin Jalali","doi":"10.1093/imaiai/iaad036","DOIUrl":"https://doi.org/10.1093/imaiai/iaad036","url":null,"abstract":"Abstract Denoising a stationary process $(X_{i})_{i in mathbb{Z}}$ corrupted by additive white Gaussian noise $(Z_{i})_{i in mathbb{Z}}$ is a classic, well-studied and fundamental problem in information theory and statistical signal processing. However, finding theoretically founded computationally efficient denoising methods applicable to general sources is still an open problem. In the Bayesian set-up where the source distribution is known, a minimum mean square error (MMSE) denoiser estimates $X^{n}$ from noisy measurements $Y^{n}$ as $hat{X}^{n}=mathrm{E}[X^{n}|Y^{n}]$. However, for general sources, computing $mathrm{E}[X^{n}|Y^{n}]$ is computationally very challenging, if not infeasible. In this paper, starting from a Bayesian set-up, a novel denoising method, namely, quantized maximum a posteriori (Q-MAP) denoiser is proposed and its asymptotic performance is analysed. Both for memoryless sources, and for structured first-order Markov sources, it is shown that, asymptotically, as $sigma _{z}^{2} $ (noise variance) converges to zero, ${1over sigma _{z}^{2}} mathrm{E}[(X_{i}-hat{X}^{mathrm{QMAP}}_{i})^{2}]$ converges to the information dimension of the source. For the studied memoryless sources, this limit is known to be optimal. A key advantage of the Q-MAP denoiser, unlike an MMSE denoiser, is that it highlights the key properties of the source distribution that are to be used in its denoising. This key property leads to a new learning-based denoising approach that is applicable to generic structured sources. Using ImageNet database for training, initial simulation results exploring the performance of such a learning-based denoiser in image denoising are presented.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135060543","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Near-optimal estimation of linear functionals with log-concave observation errors","authors":"Simon Foucart, Grigoris Paouris","doi":"10.1093/imaiai/iaad038","DOIUrl":"https://doi.org/10.1093/imaiai/iaad038","url":null,"abstract":"Abstract This note addresses the question of optimally estimating a linear functional of an object acquired through linear observations corrupted by random noise, where optimality pertains to a worst-case setting tied to a symmetric, convex and closed model set containing the object. It complements the article ‘Statistical Estimation and Optimal Recovery’ published in the Annals of Statistics in 1994. There, Donoho showed (among other things) that, for Gaussian noise, linear maps provide near-optimal estimation schemes relatively to a performance measure relevant in Statistical Estimation. Here, we advocate for a different performance measure arguably more relevant in Optimal Recovery. We show that, relatively to this new measure, linear maps still provide near-optimal estimation schemes even if the noise is merely log-concave. Our arguments, which make a connection to the deterministic noise situation and bypass properties specific to the Gaussian case, offer an alternative to parts of Donoho’s proof.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135010731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Graph-based approximate message passing iterations","authors":"Cédric Gerbelot, Raphaël Berthier","doi":"10.1093/imaiai/iaad020","DOIUrl":"https://doi.org/10.1093/imaiai/iaad020","url":null,"abstract":"Abstract Approximate message passing (AMP) algorithms have become an important element of high-dimensional statistical inference, mostly due to their adaptability and concentration properties, the state evolution (SE) equations. This is demonstrated by the growing number of new iterations proposed for increasingly complex problems, ranging from multi-layer inference to low-rank matrix estimation with elaborate priors. In this paper, we address the following questions: is there a structure underlying all AMP iterations that unifies them in a common framework? Can we use such a structure to give a modular proof of state evolution equations, adaptable to new AMP iterations without reproducing each time the full argument? We propose an answer to both questions, showing that AMP instances can be generically indexed by an oriented graph. This enables to give a unified interpretation of these iterations, independent from the problem they solve, and a way of composing them arbitrarily. We then show that all AMP iterations indexed by such a graph verify rigorous SE equations, extending the reach of previous proofs and proving a number of recent heuristic derivations of those equations. Our proof naturally includes non-separable functions and we show how existing refinements, such as spatial coupling or matrix-valued variables, can be combined with our framework.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135110705","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spectral deconvolution of matrix models: the additive case","authors":"Pierre Tarrago","doi":"10.1093/imaiai/iaad037","DOIUrl":"https://doi.org/10.1093/imaiai/iaad037","url":null,"abstract":"Abstract We implement a complex analytic method to build an estimator of the spectrum of a matrix perturbed by the addition of a random matrix noise in the free probabilistic regime. This method, which has been previously introduced by Arizmendi, Tarrago and Vargas, involves two steps: the first step consists in a fixed point method to compute the Stieltjes transform of the desired distribution in a certain domain, and the second step is a classical deconvolution by a Cauchy distribution, whose parameter depends on the intensity of the noise. This method thus reduces the spectral deconvolution problem to a classical one. We provide explicit bounds for the mean squared error of the first step under the assumption that the distribution of the noise is unitary invariant. In the case where the unknown measure is sparse or close to a distribution with a density with enough smoothness, we prove that the resulting estimator converges to the measure in the $1$-Wasserstein distance at speed $O(1/sqrt{N})$, where $N$ is the dimension of the matrix.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135109334","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High-dimensional asymptotics of Langevin dynamics in spiked matrix models","authors":"Tengyuan Liang, Subhabrata Sen, Pragya Sur","doi":"10.1093/imaiai/iaad042","DOIUrl":"https://doi.org/10.1093/imaiai/iaad042","url":null,"abstract":"Abstract We study Langevin dynamics for recovering the planted signal in the spiked matrix model. We provide a ‘path-wise’ characterization of the overlap between the output of the Langevin algorithm and the planted signal. This overlap is characterized in terms of a self-consistent system of integro-differential equations, usually referred to as the Crisanti–Horner–Sommers–Cugliandolo–Kurchan equations in the spin glass literature. As a second contribution, we derive an explicit formula for the limiting overlap in terms of the signal-to-noise ratio and the injected noise in the diffusion. This uncovers a sharp phase transition—in one regime, the limiting overlap is strictly positive, while in the other, the injected noise overcomes the signal, and the limiting overlap is zero.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135256563","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-marginal Gromov–Wasserstein transport and barycentres","authors":"Florian Beier, Robert Beinert, Gabriele Steidl","doi":"10.1093/imaiai/iaad041","DOIUrl":"https://doi.org/10.1093/imaiai/iaad041","url":null,"abstract":"Abstract Gromov–Wasserstein (GW) distances are combinations of Gromov–Hausdorff and Wasserstein distances that allow the comparison of two different metric measure spaces (mm-spaces). Due to their invariance under measure- and distance-preserving transformations, they are well suited for many applications in graph and shape analysis. In this paper, we introduce the concept of multi-marginal GW transport between a set of mm-spaces as well as its regularized and unbalanced versions. As a special case, we discuss multi-marginal fused variants, which combine the structure information of an mm-space with label information from an additional label space. To tackle the new formulations numerically, we consider the bi-convex relaxation of the multi-marginal GW problem, which is tight in the balanced case if the cost function is conditionally negative definite. The relaxed model can be solved by an alternating minimization, where each step can be performed by a multi-marginal Sinkhorn scheme. We show relations of our multi-marginal GW problem to (unbalanced, fused) GW barycentres and present various numerical results, which indicate the potential of the concept.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135257493","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Out-of-sample error estimation for M-estimators with convex penalty","authors":"Pierre C Bellec","doi":"10.1093/imaiai/iaad031","DOIUrl":"https://doi.org/10.1093/imaiai/iaad031","url":null,"abstract":"Abstract A generic out-of-sample error estimate is proposed for $M$-estimators regularized with a convex penalty in high-dimensional linear regression where $(boldsymbol{X},boldsymbol{y})$ is observed and the dimension $p$ and sample size $n$ are of the same order. The out-of-sample error estimate enjoys a relative error of order $n^{-1/2}$ in a linear model with Gaussian covariates and independent noise, either non-asymptotically when $p/nle gamma $ or asymptotically in the high-dimensional asymptotic regime $p/nto gamma ^{prime}in (0,infty )$. General differentiable loss functions $rho $ are allowed provided that the derivative of the loss is 1-Lipschitz; this includes the least-squares loss as well as robust losses such as the Huber loss and its smoothed versions. The validity of the out-of-sample error estimate holds either under a strong convexity assumption, or for the L1-penalized Huber M-estimator and the Lasso under a sparsity assumption and a bound on the number of contaminated observations. For the square loss and in the absence of corruption in the response, the results additionally yield $n^{-1/2}$-consistent estimates of the noise variance and of the generalization error. This generalizes, to arbitrary convex penalty and arbitrary covariance, estimates that were previously known for the Lasso.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135258177","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}