An $\ell_\infty$ Eigenvector Perturbation Bound and Its Application to Robust Covariance Estimation
Authors: Jianqing Fan, Weichen Wang, Yiqiao Zhong
Journal: Journal of Machine Learning Research
Published: 2018-04-01
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6867801/pdf/
Citations: 0
Abstract
In statistics and machine learning, we are interested in the eigenvectors (or singular vectors) of certain matrices (e.g. covariance matrices, data matrices, etc.). However, those matrices are usually perturbed by noise or statistical errors, either from random sampling or structural patterns. The Davis-Kahan sin θ theorem is often used to bound the difference between the eigenvectors of a matrix $A$ and those of a perturbed matrix $\widetilde{A} = A + E$, in terms of the $\ell_2$ norm. In this paper, we prove that when $A$ is a low-rank and incoherent matrix, the $\ell_\infty$ norm perturbation bound of singular vectors (or eigenvectors in the symmetric case) is smaller by a factor of $\sqrt{d_1}$ or $\sqrt{d_2}$ for left and right vectors, where $d_1$ and $d_2$ are the matrix dimensions. The power of this new perturbation result is shown in robust covariance estimation, particularly when random variables have heavy tails. There, we propose new robust covariance estimators and establish their asymptotic properties using the newly developed perturbation bound. Our theoretical results are verified through extensive numerical experiments.
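The gap between the two norms described in the abstract can be seen in a small numerical sketch. This is an illustration only, not the paper's experiment: the rank-one matrix, the Gaussian noise model, and the names `top_eigvec`, `perturbation_errors`, `d`, and `noise_scale` are all assumptions chosen for the demonstration. It builds a symmetric, low-rank, incoherent matrix A, perturbs it with symmetric noise E, and compares the ℓ2 and ℓ∞ errors of the leading eigenvector.

```python
import numpy as np


def top_eigvec(M):
    """Return a unit eigenvector for the largest eigenvalue of symmetric M."""
    _, vecs = np.linalg.eigh(M)  # eigh returns eigenvalues in ascending order
    return vecs[:, -1]


def perturbation_errors(d=400, noise_scale=0.5, seed=0):
    """Compare l2 and l-infinity eigenvector errors under a perturbation.

    Assumed setup (hypothetical, for illustration):
      * A = d * v v^T is rank one, and v has entries +-1/sqrt(d), so its
        mass is spread evenly over coordinates (an incoherent eigenvector).
      * E is a symmetrized Gaussian noise matrix.
    """
    rng = np.random.default_rng(seed)

    v = rng.choice([-1.0, 1.0], size=d) / np.sqrt(d)  # unit norm, incoherent
    A = d * np.outer(v, v)                            # leading eigenvalue is d

    G = rng.normal(scale=noise_scale, size=(d, d))
    E = (G + G.T) / 2                                 # symmetric perturbation

    v_hat = top_eigvec(A + E)
    if v_hat @ v < 0:                                 # eigenvectors are defined up to sign
        v_hat = -v_hat

    diff = v_hat - v
    return np.linalg.norm(diff), np.max(np.abs(diff))


if __name__ == "__main__":
    d = 400
    err_l2, err_linf = perturbation_errors(d=d)
    print(f"l2 error:         {err_l2:.5f}")
    print(f"l-infinity error: {err_linf:.5f}")
    print(f"ratio l2 / l-inf: {err_l2 / err_linf:.2f}  (sqrt(d) = {np.sqrt(d):.1f})")
```

In this setup the entrywise error is typically several times smaller than the ℓ2 error, which is consistent with the direction of the abstract's claim. The observed ratio usually falls short of the full $\sqrt{d}$ because of logarithmic factors and constants, and a single simulated draw is no substitute for the paper's theorem.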
About the journal:
The Journal of Machine Learning Research (JMLR) provides an international forum for the electronic and paper publication of high-quality scholarly articles in all areas of machine learning. All published papers are freely available online.
JMLR has a commitment to rigorous yet rapid reviewing.
JMLR seeks previously unpublished papers on machine learning that contain:
new principled algorithms with sound empirical validation, and with justification of theoretical, psychological, or biological nature;
experimental and/or theoretical studies yielding new insight into the design and behavior of learning in intelligent systems;
accounts of applications of existing techniques that shed light on the strengths and weaknesses of the methods;
formalization of new learning tasks (e.g., in the context of new applications) and of methods for assessing performance on those tasks;
development of new analytical frameworks that advance theoretical studies of practical learning methods;
computational models of data from natural learning systems at the behavioral or neural level; or
extremely well-written surveys of existing work.