{"title":"Freeness over the diagonal and outliers detection in deformed random matrices with a variance profile","authors":"Jérémie Bigot;Camille Male","doi":"10.1093/imaiai/iaaa012","DOIUrl":"https://doi.org/10.1093/imaiai/iaaa012","url":null,"abstract":"We study the eigenvalue distribution of a Gaussian unitary ensemble (GUE) matrix with a variance profile that is perturbed by an additive random matrix that may possess spikes. Our approach is guided by Voiculescu's notion of freeness with amalgamation over the diagonal and by the notion of deterministic equivalent. This allows to derive a fixed point equation to approximate the spectral distribution of certain deformed GUE matrices with a variance profile and to characterize the location of potential outliers in such models in a non-asymptotic setting. We also consider the singular values distribution of a rectangular Gaussian random matrix with a variance profile in a similar setting of additive perturbation. We discuss the application of this approach to the study of low-rank matrix denoising models in the presence of heteroscedastic noise, that is when the amount of variance in the observed data matrix may change from entry to entry. Numerical experiments are used to illustrate our results. Deformed random matrix, Variance profile, Outlier detection, Free probability, Freeness with amalgamation, Operator-valued Stieltjes transform, Gaussian spiked model, Low-rank model. 2000 Math Subject Classification: 62G05, 62H12.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":"10 3","pages":"863-919"},"PeriodicalIF":1.6,"publicationDate":"2021-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1093/imaiai/iaaa012","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50224073","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Nonlinear generalization of the monotone single index model","authors":"Željko Kereta;Timo Klock;Valeriya Naumova","doi":"10.1093/imaiai/iaaa013","DOIUrl":"https://doi.org/10.1093/imaiai/iaaa013","url":null,"abstract":"Single index model is a powerful yet simple model, widely used in statistics, machine learning and other scientific fields. It models the regression function as \u0000<tex>$g(left <{a},{x}right>)$</tex>\u0000, where \u0000<tex>$a$</tex>\u0000 is an unknown index vector and \u0000<tex>$x$</tex>\u0000 are the features. This paper deals with a nonlinear generalization of this framework to allow for a regressor that uses multiple index vectors, adapting to local changes in the responses. To do so, we exploit the conditional distribution over function-driven partitions and use linear regression to locally estimate index vectors. We then regress by applying a k-nearest neighbor-type estimator that uses a localized proxy of the geodesic metric. We present theoretical guarantees for estimation of local index vectors and out-of-sample prediction and demonstrate the performance of our method with experiments on synthetic and real-world data sets, comparing it with state-of-the-art methods.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":"10 3","pages":"987-1029"},"PeriodicalIF":1.6,"publicationDate":"2021-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1093/imaiai/iaaa013","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50347110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Approximate separability of symmetrically penalized least squares in high dimensions: characterization and consequences","authors":"Michael Celentano","doi":"10.1093/imaiai/iaaa037","DOIUrl":"https://doi.org/10.1093/imaiai/iaaa037","url":null,"abstract":"We show that the high-dimensional behavior of symmetrically penalized least squares with a possibly non-separable, symmetric, convex penalty in both (i) the Gaussian sequence model and (ii) the linear model with uncorrelated Gaussian designs nearly agrees with the behavior of least squares with an appropriately chosen separable penalty in these same models. This agreement is established by finite-sample concentration inequalities which precisely characterize the behavior of symmetrically penalized least squares in both models via a comparison to a simple scalar statistical model. The concentration inequalities are novel in their precision and generality. Our results help clarify that the role non-separability can play in high-dimensional M-estimation. In particular, if the empirical distribution of the coordinates of the parameter is known—exactly or approximately—there are at most limited advantages to use non-separable, symmetric penalties over separable ones. In contrast, if the empirical distribution of the coordinates of the parameter is unknown, we argue that non-separable, symmetric penalties automatically implement an adaptive procedure, which we characterize. We also provide a partial converse which characterizes the adaptive procedures which can be implemented in this way.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":"10 3","pages":"1105-1165"},"PeriodicalIF":1.6,"publicationDate":"2021-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1093/imaiai/iaaa037","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50347113","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-reference factor analysis: low-rank covariance estimation under unknown translations","authors":"Boris Landa;Yoel Shkolnisky","doi":"10.1093/imaiai/iaaa019","DOIUrl":"https://doi.org/10.1093/imaiai/iaaa019","url":null,"abstract":"We consider the problem of estimating the covariance matrix of a random signal observed through unknown translations (modeled by cyclic shifts) and corrupted by noise. Solving this problem allows to discover low-rank structures masked by the existence of translations (which act as nuisance parameters), with direct application to principal components analysis. We assume that the underlying signal is of length \u0000<tex>$L$</tex>\u0000 and follows a standard factor model with mean zero and \u0000<tex>$r$</tex>\u0000 normally distributed factors. To recover the covariance matrix in this case, we propose to employ the second- and fourth-order shift-invariant moments of the signal known as the power spectrum and the trispectrum. We prove that they are sufficient for recovering the covariance matrix (under a certain technical condition) when \u0000<tex>$r<sqrt{L}$</tex>\u0000. Correspondingly, we provide a polynomial-time procedure for estimating the covariance matrix from many (translated and noisy) observations, where no explicit knowledge of \u0000<tex>$r$</tex>\u0000 is required, and prove the procedure's statistical consistency. While our results establish that covariance estimation is possible from the power spectrum and the trispectrum for low-rank covariance matrices, we prove that this is not the case for full-rank covariance matrices. We conduct numerical experiments that corroborate our theoretical findings and demonstrate the favourable performance of our algorithms in various settings, including in high levels of noise.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":"10 3","pages":"773-812"},"PeriodicalIF":1.6,"publicationDate":"2021-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1093/imaiai/iaaa019","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50224071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The generalized orthogonal Procrustes problem in the high noise regime","authors":"Thomas Pumir;Amit Singer;Nicolas Boumal","doi":"10.1093/imaiai/iaaa035","DOIUrl":"https://doi.org/10.1093/imaiai/iaaa035","url":null,"abstract":"We consider the problem of estimating a cloud of points from numerous noisy observations of that cloud after unknown rotations and possibly reflections. This is an instance of the general problem of estimation under group action, originally inspired by applications in three-dimensional imaging and computer vision. We focus on a regime where the noise level is larger than the magnitude of the signal, so much so that the rotations cannot be estimated reliably. We propose a simple and efficient procedure based on invariant polynomials (effectively: the Gram matrices) to recover the signal, and we assess it against fundamental limits of the problem that we derive. We show our approach adapts to the noise level and is statistically optimal (up to constants) for both the low and high noise regimes. In studying the variance of our estimator, we encounter the question of the sensivity of a type of thin Cholesky factorization, for which we provide an improved bound which may be of independent interest.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":"10 3","pages":"921-954"},"PeriodicalIF":1.6,"publicationDate":"2021-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1093/imaiai/iaaa035","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50224074","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Empirical risk minimization for dynamical systems and stationary processes","authors":"Kevin McGoff;Andrew B Nobel","doi":"10.1093/imaiai/iaaa043","DOIUrl":"https://doi.org/10.1093/imaiai/iaaa043","url":null,"abstract":"We introduce and analyze a general framework for empirical risk minimization in which the observations and models of interest may be stationary systems or processes. Within the framework, which is presented in terms of dynamical systems, empirical risk minimization can be studied as a two-step procedure in which (i) the trajectory of an observed (but unknown) system is fit by a trajectory of a known reference system via minimization of cumulative per-state loss, and (ii) an invariant parameter estimate is obtained from the initial state of the best fit trajectory. We show that the weak limits of the empirical measures of best-matched trajectories are dynamically invariant couplings (joinings) of the observed and reference systems with minimal risk. Moreover, we establish that the family of risk-minimizing joinings is convex and compact and that it fully characterizes the asymptotic behavior of the estimated parameters, directly addressing identifiability. Our analysis of empirical risk minimization applies to well-studied problems such as maximum likelihood estimation and non-linear regression, as well as more complex problems in which the models of interest are stationary processes. To illustrate the latter, we undertake an extended analysis of system identification from quantized trajectories subject to noise, a problem at the intersection of dynamics and statistics.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":"10 3","pages":"1073-1104"},"PeriodicalIF":1.6,"publicationDate":"2021-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1093/imaiai/iaaa043","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50347112","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Representation theoretic patterns in multi-frequency class averaging for three-dimensional cryo-electron microscopy","authors":"Yifeng Fan;Tingran Gao;Zhizhen Zhao","doi":"10.1093/imaiai/iaab012","DOIUrl":"https://doi.org/10.1093/imaiai/iaab012","url":null,"abstract":"We develop in this paper a novel intrinsic classification algorithm—multi-frequency class averaging (MFCA)—for classifying noisy projection images obtained from three-dimensional cryo-electron microscopy by the similarity among their viewing directions. This new algorithm leverages multiple irreducible representations of the unitary group to introduce additional redundancy into the representation of the optimal in-plane rotational alignment, extending and outperforming the existing class averaging algorithm that uses only a single representation. The formal algebraic model and representation theoretic patterns of the proposed MFCA algorithm extend the framework of Hadani and Singer to arbitrary irreducible representations of the unitary group. We conceptually establish the consistency and stability of MFCA by inspecting the spectral properties of a generalized local parallel transport operator through the lens of Wigner \u0000<tex>$D$</tex>\u0000-matrices. We demonstrate the efficacy of the proposed algorithm with numerical experiments.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":"10 3","pages":"723-771"},"PeriodicalIF":1.6,"publicationDate":"2021-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1093/imaiai/iaab012","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50224070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multiplier bootstrap for quantile regression: non-asymptotic theory under random design","authors":"Xiaoou Pan;Wen-Xin Zhou","doi":"10.1093/imaiai/iaaa006","DOIUrl":"https://doi.org/10.1093/imaiai/iaaa006","url":null,"abstract":"This paper establishes non-asymptotic concentration bound and Bahadur representation for the quantile regression estimator and its multiplier bootstrap counterpart in the random design setting. The non-asymptotic analysis keeps track of the impact of the parameter dimension \u0000<tex>$d$</tex>\u0000 and sample size \u0000<tex>$n$</tex>\u0000 in the rate of convergence, as well as in normal and bootstrap approximation errors. These results represent a useful complement to the asymptotic results under fixed design and provide theoretical guarantees for the validity of Rademacher multiplier bootstrap in the problems of confidence construction and goodness-of-fit testing. Numerical studies lend strong support to our theory and highlight the effectiveness of Rademacher bootstrap in terms of accuracy, reliability and computational efficiency.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":"10 3","pages":"813-861"},"PeriodicalIF":1.6,"publicationDate":"2021-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1093/imaiai/iaaa006","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50224072","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning a deep convolutional neural network via tensor decomposition","authors":"Samet Oymak;Mahdi Soltanolkotabi","doi":"10.1093/imaiai/iaaa042","DOIUrl":"https://doi.org/10.1093/imaiai/iaaa042","url":null,"abstract":"In this paper, we study the problem of learning the weights of a deep convolutional neural network. We consider a network where convolutions are carried out over non-overlapping patches. We develop an algorithm for simultaneously learning all the kernels from the training data. Our approach dubbed deep tensor decomposition (DeepTD) is based on a low-rank tensor decomposition. We theoretically investigate DeepTD under a realizable model for the training data where the inputs are chosen i.i.d. from a Gaussian distribution and the labels are generated according to planted convolutional kernels. We show that DeepTD is sample efficient and provably works as soon as the sample size exceeds the total number of convolutional weights in the network.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":"10 3","pages":"1031-1071"},"PeriodicalIF":1.6,"publicationDate":"2021-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1093/imaiai/iaaa042","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50347111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generalized score matching for general domains.","authors":"Shiqing Yu, Mathias Drton, Ali Shojaie","doi":"10.1093/imaiai/iaaa041","DOIUrl":"https://doi.org/10.1093/imaiai/iaaa041","url":null,"abstract":"<p><p>Estimation of density functions supported on general domains arises when the data are naturally restricted to a proper subset of the real space. This problem is complicated by typically intractable normalizing constants. Score matching provides a powerful tool for estimating densities with such intractable normalizing constants but as originally proposed is limited to densities on [Formula: see text] and [Formula: see text]. In this paper, we offer a natural generalization of score matching that accommodates densities supported on a very general class of domains. We apply the framework to truncated graphical and pairwise interaction models and provide theoretical guarantees for the resulting estimators. We also generalize a recently proposed method from bounded to unbounded domains and empirically demonstrate the advantages of our method.</p>","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":"11 2","pages":"739-780"},"PeriodicalIF":1.6,"publicationDate":"2021-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9203079/pdf/iaaa041.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40041673","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}