Optimal Shrinkage of Eigenvalues in the Spiked Covariance Model.

IF 3.2 · JCR Q1, Statistics & Probability · CAS Tier 1 (Mathematics)
Annals of Statistics, Vol. 46, No. 4, pp. 1742–1778. Pub Date: 2018-08-01 (Epub 2018-06-27). DOI: 10.1214/17-AOS1601
David L Donoho, Matan Gavish, Iain M Johnstone
Citations: 181

Abstract

We show that in a common high-dimensional covariance model, the choice of loss function has a profound effect on optimal estimation. In an asymptotic framework based on the Spiked Covariance model and use of orthogonally invariant estimators, we show that optimal estimation of the population covariance matrix boils down to design of an optimal shrinker η that acts elementwise on the sample eigenvalues. Indeed, to each loss function there corresponds a unique admissible eigenvalue shrinker η* dominating all other shrinkers. The shape of the optimal shrinker is determined by the choice of loss function and, crucially, by inconsistency of both eigenvalues and eigenvectors of the sample covariance matrix. Details of these phenomena and closed form formulas for the optimal eigenvalue shrinkers are worked out for a menagerie of 26 loss functions for covariance estimation found in the literature, including the Stein, Entropy, Divergence, Fréchet, Bhattacharya/Matusita, Frobenius Norm, Operator Norm, Nuclear Norm and Condition Number losses.
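To make the elementwise shrinkage concrete, the sketch below implements the widely cited Frobenius-loss case using standard spiked-model asymptotics (noise level normalized to 1, aspect ratio γ = p/n). The function name and the restriction to this single loss are illustrative assumptions; the paper itself derives closed-form shrinkers for all 26 losses.

```python
import math

def frobenius_shrinker(lam: float, gamma: float) -> float:
    """Sketch of the Frobenius-loss optimal eigenvalue shrinker for the
    spiked covariance model (noise variance 1, gamma = p/n).

    Assumed formulas from standard spiked-model asymptotics; this is an
    illustration of the elementwise-shrinkage idea, not code from the paper.
    """
    # Sample eigenvalues inside the Marchenko-Pastur bulk carry no
    # recoverable signal; shrink them back to the noise level 1.
    bulk_edge = (1.0 + math.sqrt(gamma)) ** 2
    if lam <= bulk_edge:
        return 1.0
    # Invert the eigenvalue bias map: recover the population spike ell
    # from the upwardly biased sample eigenvalue lam.
    t = lam + 1.0 - gamma
    ell = (t + math.sqrt(t * t - 4.0 * lam)) / 2.0
    # Asymptotic squared cosine between sample and population eigenvectors
    # (eigenvector inconsistency is what forces shrinkage below ell).
    c2 = (1.0 - gamma / (ell - 1.0) ** 2) / (1.0 + gamma / (ell - 1.0))
    s2 = 1.0 - c2
    # Frobenius-loss optimum: eta = ell * c^2 + s^2.
    return ell * c2 + s2
```

For example, with γ = 0.5 the bulk edge is about 2.91: a sample eigenvalue of 10 is shrunk to roughly 8.9 (below its biased sample value but above the true spike's visible cosine-weighted projection), while any eigenvalue below the edge collapses to 1.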


Journal: Annals of Statistics (Mathematics – Statistics & Probability)
CiteScore: 9.30
Self-citation rate: 8.90%
Articles per year: 119
Review time: 6–12 weeks
期刊介绍: The Annals of Statistics aim to publish research papers of highest quality reflecting the many facets of contemporary statistics. Primary emphasis is placed on importance and originality, not on formalism. The journal aims to cover all areas of statistics, especially mathematical statistics and applied & interdisciplinary statistics. Of course many of the best papers will touch on more than one of these general areas, because the discipline of statistics has deep roots in mathematics, and in substantive scientific fields.