Title: Entrywise Eigenvector Analysis of Random Matrices with Low Expected Rank
Authors: Emmanuel Abbe, Jianqing Fan, Kaizheng Wang, Yiqiao Zhong
Journal: Annals of Statistics (JCR Q1, Statistics & Probability; impact factor 3.2)
Published: 2020-06-01 (Epub 2020-07-17)
DOI: 10.1214/19-aos1854 (https://doi.org/10.1214/19-aos1854)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8046180/pdf/nihms-1053828.pdf
Abstract
Recovering low-rank structures via eigenvector perturbation analysis is a common problem in statistical machine learning, such as in factor analysis, community detection, ranking, matrix completion, among others. While a large variety of bounds are available for average errors between empirical and population statistics of eigenvectors, few results are tight for entrywise analyses, which are critical for a number of problems such as community detection. This paper investigates entrywise behaviors of eigenvectors for a large class of random matrices whose expectations are low-rank, which helps settle the conjecture in Abbe et al. (2014b) that the spectral algorithm achieves exact recovery in the stochastic block model without any trimming or cleaning steps. The key is a first-order approximation of eigenvectors under the $\ell_\infty$ norm: $u_k \approx A u_k^* / \lambda_k^*$, where $\{u_k\}$ and $\{u_k^*\}$ are eigenvectors of a random matrix $A$ and its expectation $\mathbb{E}A$, respectively. The fact that the approximation is both tight and linear in $A$ facilitates sharp comparisons between $u_k$ and $u_k^*$. In particular, it allows for comparing the signs of $u_k$ and $u_k^*$ even if $\|u_k - u_k^*\|_\infty$ is large. The results are further extended to perturbations of eigenspaces, yielding new $\ell_\infty$-type bounds for synchronization ($\mathbb{Z}_2$-spiked Wigner model) and noisy matrix completion.
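The first-order approximation described in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' experiment: it assumes a balanced two-community stochastic block model, and the function names and the parameters `n`, `p`, `q`, `seed` are hypothetical choices made for illustration. It compares the second eigenvector of a sampled adjacency matrix with both the population eigenvector and the linear approximation built from the expectation, using the sup norm.

```python
import numpy as np

def sbm_adjacency(n, p, q, rng):
    """Sample a symmetric two-community SBM adjacency matrix without self-loops.

    Illustrative helper (not from the paper); n is assumed even.
    """
    labels = np.repeat([1.0, -1.0], n // 2)          # first half +1, second half -1
    same = np.equal.outer(labels, labels)            # True where both nodes share a community
    probs = np.where(same, p, q)                     # edge probability per pair
    m = labels.size
    upper = np.triu(rng.random((m, m)) < probs, k=1).astype(float)
    return upper + upper.T, labels

def compare_second_eigenvector(n=600, p=0.30, q=0.05, seed=0):
    """Compare u_2 with u_2^* and with the first-order term A u_2^* / lambda_2^*."""
    rng = np.random.default_rng(seed)
    A, labels = sbm_adjacency(n, p, q, rng)
    m = labels.size

    # Second eigenpair of E[A]. With a zero diagonal, E[A] equals the
    # two-block matrix with entries p / q minus p * I, so the eigenvector is
    # labels / sqrt(m) and the eigenvalue is m * (p - q) / 2 - p.
    u_star = labels / np.sqrt(m)
    lam_star = m * (p - q) / 2 - p

    # Empirical eigenvector for the second-largest eigenvalue of A.
    _, vecs = np.linalg.eigh(A)                      # eigenvalues in ascending order
    u = vecs[:, -2]
    if u @ u_star < 0:                               # resolve the sign ambiguity
        u = -u

    first_order = A @ u_star / lam_star              # linear in A
    return {
        "err_to_population": float(np.max(np.abs(u - u_star))),
        "err_to_first_order": float(np.max(np.abs(u - first_order))),
        "misclassified": int(np.sum(np.sign(u) != labels)),
    }

if __name__ == "__main__":
    for name, value in compare_second_eigenvector().items():
        print(f"{name}: {value}")
```

In this setting one would expect the sup-norm distance to the first-order term to be noticeably smaller than the distance to the population eigenvector, and the signs of the empirical eigenvector to recover the community labels, which is the behavior the abstract describes. The chosen parameters are a dense, well-separated regime picked for a quick demonstration; they do not probe the exact-recovery threshold.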
About the journal:
The Annals of Statistics aim to publish research papers of highest quality reflecting the many facets of contemporary statistics. Primary emphasis is placed on importance and originality, not on formalism. The journal aims to cover all areas of statistics, especially mathematical statistics and applied & interdisciplinary statistics. Of course many of the best papers will touch on more than one of these general areas, because the discipline of statistics has deep roots in mathematics, and in substantive scientific fields.