Minimax rates of convergence for high-dimensional regression under ℓq-ball sparsity

Garvesh Raskutti, M. Wainwright, Bin Yu
{"title":"Minimax rates of convergence for high-dimensional regression under ℓq-ball sparsity","authors":"Garvesh Raskutti, M. Wainwright, Bin Yu","doi":"10.1109/ALLERTON.2009.5394804","DOIUrl":null,"url":null,"abstract":"Consider the standard linear regression model y = Xß∗ + w, where y ∊ R<sup>n</sup> is an observation vector, X ∊ R<sup>n×d</sup> is a measurement matrix, ß∗ ∊ R<sup>d</sup> is the unknown regression vector, and w ~ N (0, σ<sup>2</sup>Ι) is additive Gaussian noise. This paper determines sharp minimax rates of convergence for estimation of ß∗ in l<inf>2</inf> norm, assuming that β∗ belongs to a weak l<inf>b</inf>-ball B<inf>q</inf>(ñ<inf>q</inf>) for some q ∊ [0,1]. We show that under suitable regularity conditions on the design matrix X, the minimax error in squared l<inf>2</inf>-norm scales as R<inf>q</inf>(log d ÷ n)<sup>1 −q÷2</sup>. In addition, we provide lower bounds on rates of convergence for general l<inf>p</inf> norm (for all p ∊ [l,+∞], p ≠ q). Our proofs of the lower bounds are information-theoretic in nature, based on Fano's inequality and results on the metric entropy of the balls B<inf>q</inf>(R<inf>q</inf>). Matching upper bounds are derived by direct analysis of the solution to an optimization algorithm over B<inf>q</inf>(R<inf>q</inf>). We prove that the conditions on X required by optimal algorithms are satisfied with high probability by broad classes of non-i.i.d. Gaussian random matrices, for which RIP or other sparse eigenvalue conditions are violated. For q = 0, t<inf>1</inf>-based methods (Lasso and Dantzig selector) achieve the minimax optimal rates in t<inf>2</inf> error, but require stronger regularity conditions on the design than the non-convex optimization algorithm used to determine the minimax upper bounds.","PeriodicalId":440015,"journal":{"name":"2009 47th Annual Allerton Conference on Communication, Control, and Computing (Allerton)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2009 47th Annual Allerton Conference on Communication, Control, and Computing (Allerton)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ALLERTON.2009.5394804","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 13

Abstract

Consider the standard linear regression model y = Xβ∗ + w, where y ∊ ℝ^n is an observation vector, X ∊ ℝ^{n×d} is a measurement matrix, β∗ ∊ ℝ^d is the unknown regression vector, and w ~ N(0, σ²I) is additive Gaussian noise. This paper determines sharp minimax rates of convergence for estimation of β∗ in ℓ2-norm, assuming that β∗ belongs to a weak ℓq-ball Bq(Rq) for some q ∊ [0, 1]. We show that under suitable regularity conditions on the design matrix X, the minimax error in squared ℓ2-norm scales as Rq(log d / n)^{1−q/2}. In addition, we provide lower bounds on rates of convergence for the general ℓp-norm (for all p ∊ [1, +∞], p ≠ q). Our proofs of the lower bounds are information-theoretic in nature, based on Fano's inequality and results on the metric entropy of the balls Bq(Rq). Matching upper bounds are derived by direct analysis of the solution to an optimization algorithm over Bq(Rq). We prove that the conditions on X required by optimal algorithms are satisfied with high probability by broad classes of non-i.i.d. Gaussian random matrices, for which RIP or other sparse eigenvalue conditions are violated. For q = 0, ℓ1-based methods (the Lasso and the Dantzig selector) achieve the minimax optimal rates in ℓ2 error, but require stronger regularity conditions on the design than the non-convex optimization algorithm used to determine the minimax upper bounds.
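For q = 0 the ball B0(R0) is the set of vectors with at most s = R0 nonzero entries, and the rate Rq(log d / n)^{1−q/2} reduces to s log d / n. The Python sketch below (not from the paper; the i.i.d. Gaussian design, noise level, and the regularization weight λ = 2σ√(log d / n) are illustrative assumptions, with λ a standard theoretical choice rather than one prescribed here) simulates y = Xβ∗ + w and compares the Lasso's squared ℓ2 error against this rate.

```python
# Minimal simulation sketch for the exactly sparse case q = 0: generate
# y = X beta* + w and compare the Lasso's squared l2 error with the
# minimax rate s * log(d) / n. All parameter values are illustrative.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, d, s, sigma = 200, 500, 5, 1.0  # samples, dimension, sparsity, noise level

# Standard Gaussian design with i.i.d. entries; the paper's conditions also
# hold w.h.p. for broader classes of non-i.i.d. Gaussian ensembles.
X = rng.standard_normal((n, d))
beta_star = np.zeros(d)
beta_star[:s] = 1.0                # beta* lies in the l0-ball B_0(s)
y = X @ beta_star + sigma * rng.standard_normal(n)

# Standard theoretical choice of regularization weight (an assumption here).
lam = 2 * sigma * np.sqrt(np.log(d) / n)
# sklearn's Lasso minimizes (1/(2n)) * ||y - Xb||^2 + alpha * ||b||_1.
beta_hat = Lasso(alpha=lam, fit_intercept=False).fit(X, y).coef_

err = np.sum((beta_hat - beta_star) ** 2)
rate = s * np.log(d) / n           # R_q (log d / n)^{1-q/2} with q = 0, R_0 = s
print(f"squared l2 error: {err:.4f}   rate s*log(d)/n: {rate:.4f}")
```

Up to constant factors, the printed error should track the rate as n, d, and s vary; the paper's contribution is showing that this scaling is minimax optimal, and that for q = 0 the Lasso attains it under stronger design conditions than the non-convex estimator used for the upper bounds.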