Noisy linear inverse problems under convex constraints: Exact risk asymptotics in high dimensions

IF 3.2 · CAS Tier 1 (Mathematics) · JCR Q1, STATISTICS & PROBABILITY
Qiyang Han
{"title":"凸约束下的噪声线性逆问题:高维的精确风险渐近性","authors":"Qiyang Han","doi":"10.1214/23-aos2301","DOIUrl":null,"url":null,"abstract":"In the standard Gaussian linear measurement model Y=Xμ0+ξ∈Rm with a fixed noise level σ>0, we consider the problem of estimating the unknown signal μ0 under a convex constraint μ0∈K, where K is a closed convex set in Rn. We show that the risk of the natural convex constrained least squares estimator (LSE) μˆ(σ) can be characterized exactly in high-dimensional limits, by that of the convex constrained LSE μˆKseq in the corresponding Gaussian sequence model at a different noise level. Formally, we show that ‖μˆ(σ)−μ0‖2/(nrn2)→1in probability, where rn 2>0 solves the fixed-point equation E‖μˆKseq( (rn2+σ2)/(m/n))−μ0‖2=nrn2. This characterization holds (uniformly) for risks rn2 in the maximal regime that ranges from constant order all the way down to essentially the parametric rate, as long as certain necessary nondegeneracy condition is satisfied for μˆ(σ). The precise risk characterization reveals a fundamental difference between noiseless (or low noise limit) and noisy linear inverse problems in terms of the sample complexity for signal recovery. A concrete example is given by the isotonic regression problem: While exact recovery of a general monotone signal requires m≫n1/3 samples in the noiseless setting, consistent signal recovery in the noisy setting requires as few as m≫logn samples. Such a discrepancy occurs when the low and high noise risk behavior of μˆKseq differ significantly. In statistical languages, this occurs when μˆKseq estimates 0 at a faster “adaptation rate” than the slower “worst-case rate” for general signals. Several other examples, including nonnegative least squares and generalized Lasso (in constrained forms), are also worked out to demonstrate the concrete applicability of the theory in problems of different types. The proof relies on a collection of new analytic and probabilistic results concerning estimation error, log likelihood ratio test statistics and degree-of-freedom associated with μˆKseq, regarded as stochastic processes indexed by the noise level. These results are of independent interest in and of themselves.","PeriodicalId":8032,"journal":{"name":"Annals of Statistics","volume":null,"pages":null},"PeriodicalIF":3.2000,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Noisy linear inverse problems under convex constraints: Exact risk asymptotics in high dimensions\",\"authors\":\"Qiyang Han\",\"doi\":\"10.1214/23-aos2301\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In the standard Gaussian linear measurement model Y=Xμ0+ξ∈Rm with a fixed noise level σ>0, we consider the problem of estimating the unknown signal μ0 under a convex constraint μ0∈K, where K is a closed convex set in Rn. We show that the risk of the natural convex constrained least squares estimator (LSE) μˆ(σ) can be characterized exactly in high-dimensional limits, by that of the convex constrained LSE μˆKseq in the corresponding Gaussian sequence model at a different noise level. Formally, we show that ‖μˆ(σ)−μ0‖2/(nrn2)→1in probability, where rn 2>0 solves the fixed-point equation E‖μˆKseq( (rn2+σ2)/(m/n))−μ0‖2=nrn2. 
This characterization holds (uniformly) for risks rn2 in the maximal regime that ranges from constant order all the way down to essentially the parametric rate, as long as certain necessary nondegeneracy condition is satisfied for μˆ(σ). The precise risk characterization reveals a fundamental difference between noiseless (or low noise limit) and noisy linear inverse problems in terms of the sample complexity for signal recovery. A concrete example is given by the isotonic regression problem: While exact recovery of a general monotone signal requires m≫n1/3 samples in the noiseless setting, consistent signal recovery in the noisy setting requires as few as m≫logn samples. Such a discrepancy occurs when the low and high noise risk behavior of μˆKseq differ significantly. In statistical languages, this occurs when μˆKseq estimates 0 at a faster “adaptation rate” than the slower “worst-case rate” for general signals. Several other examples, including nonnegative least squares and generalized Lasso (in constrained forms), are also worked out to demonstrate the concrete applicability of the theory in problems of different types. The proof relies on a collection of new analytic and probabilistic results concerning estimation error, log likelihood ratio test statistics and degree-of-freedom associated with μˆKseq, regarded as stochastic processes indexed by the noise level. These results are of independent interest in and of themselves.\",\"PeriodicalId\":8032,\"journal\":{\"name\":\"Annals of Statistics\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":3.2000,\"publicationDate\":\"2023-08-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Annals of Statistics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1214/23-aos2301\",\"RegionNum\":1,\"RegionCategory\":\"数学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"STATISTICS & PROBABILITY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Annals of Statistics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1214/23-aos2301","RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"STATISTICS & PROBABILITY","Score":null,"Total":0}
Citations: 3

Abstract

In the standard Gaussian linear measurement model $Y = X\mu_0 + \xi \in \mathbb{R}^m$ with a fixed noise level $\sigma > 0$, we consider the problem of estimating the unknown signal $\mu_0$ under a convex constraint $\mu_0 \in K$, where $K$ is a closed convex set in $\mathbb{R}^n$. We show that the risk of the natural convex constrained least squares estimator (LSE) $\hat{\mu}(\sigma)$ can be characterized exactly in high-dimensional limits by that of the convex constrained LSE $\hat{\mu}_K^{\mathrm{seq}}$ in the corresponding Gaussian sequence model at a different noise level. Formally, we show that $\|\hat{\mu}(\sigma) - \mu_0\|^2/(n r_n^2) \to 1$ in probability, where $r_n^2 > 0$ solves the fixed-point equation $\mathbb{E}\|\hat{\mu}_K^{\mathrm{seq}}\big(\sqrt{(r_n^2 + \sigma^2)/(m/n)}\big) - \mu_0\|^2 = n r_n^2$. This characterization holds (uniformly) for risks $r_n^2$ in the maximal regime that ranges from constant order all the way down to essentially the parametric rate, as long as a certain necessary nondegeneracy condition is satisfied for $\hat{\mu}(\sigma)$. The precise risk characterization reveals a fundamental difference between noiseless (or low-noise limit) and noisy linear inverse problems in terms of the sample complexity for signal recovery. A concrete example is given by the isotonic regression problem: while exact recovery of a general monotone signal requires $m \gg n^{1/3}$ samples in the noiseless setting, consistent signal recovery in the noisy setting requires as few as $m \gg \log n$ samples. Such a discrepancy occurs when the low- and high-noise risk behaviors of $\hat{\mu}_K^{\mathrm{seq}}$ differ significantly. In statistical language, this occurs when $\hat{\mu}_K^{\mathrm{seq}}$ estimates $0$ at a faster “adaptation rate” than the slower “worst-case rate” for general signals. Several other examples, including nonnegative least squares and the generalized Lasso (in constrained forms), are also worked out to demonstrate the concrete applicability of the theory in problems of different types. The proof relies on a collection of new analytic and probabilistic results concerning the estimation error, log-likelihood-ratio test statistics and degrees of freedom associated with $\hat{\mu}_K^{\mathrm{seq}}$, regarded as stochastic processes indexed by the noise level. These results are of independent interest in and of themselves.
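To make the fixed-point characterization concrete, the sketch below works through the nonnegative-least-squares example mentioned in the abstract: $K$ is the nonnegative orthant, so the sequence-model LSE $\hat{\mu}_K^{\mathrm{seq}}(\tau)$ is the coordinatewise projection $\max(\mu_0 + \tau g, 0)$, its risk is estimated by Monte Carlo, and the fixed-point equation is solved by simple iteration before being compared with the empirical risk of the constrained LSE. This is only an illustrative sketch, not the paper's construction: the signal choice, the dimensions, and the i.i.d. $N(0, 1/n)$ design normalization are assumptions picked to be dimensionally consistent with the stated fixed-point equation.

```python
# Minimal numerical sketch of the fixed-point risk characterization for
# nonnegative least squares (K = nonnegative orthant). All problem sizes and
# the design scaling below are illustrative assumptions, not taken from the paper.
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
n, m, sigma = 500, 1500, 1.0                    # hypothetical dimensions and noise level
mu0 = np.maximum(rng.normal(size=n), 0.0)       # a nonnegative "true" signal, so mu0 lies in K

def seq_risk(tau, mc=200):
    """Monte Carlo estimate of E||Pi_K(mu0 + tau*g) - mu0||^2 for K = {mu >= 0}."""
    g = rng.normal(size=(mc, n))
    err = np.maximum(mu0 + tau * g, 0.0) - mu0  # sequence-model LSE = coordinatewise projection
    return np.mean(np.sum(err ** 2, axis=1))

# Solve E||mu_K_seq(sqrt((r^2 + sigma^2)/(m/n))) - mu0||^2 = n r^2 by iteration;
# for m > n the iteration map behaves like a contraction and settles quickly.
r2 = 1.0
for _ in range(50):
    tau = np.sqrt((r2 + sigma ** 2) / (m / n))
    r2 = seq_risk(tau) / n

# Compare with the empirical risk of the constrained LSE in the linear model Y = X mu0 + xi.
X = rng.normal(scale=1.0 / np.sqrt(n), size=(m, n))   # assumed i.i.d. N(0, 1/n) design entries
Y = X @ mu0 + sigma * rng.normal(size=m)
mu_hat, _ = nnls(X, Y)                                # argmin_{mu >= 0} ||Y - X mu||^2
print("fixed-point prediction of ||mu_hat - mu0||^2 / n :", r2)
print("empirical value of        ||mu_hat - mu0||^2 / n :", np.sum((mu_hat - mu0) ** 2) / n)
```

Under these assumptions the two printed quantities should be close, which is the content of the theorem in this special case; swapping the projection step for another convex set $K$ (e.g., isotonic regression via a monotone projection) changes only the `seq_risk` routine and the constrained solver.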
Source journal
Annals of Statistics (Mathematics – Statistics & Probability)
CiteScore: 9.30
Self-citation rate: 8.90%
Articles published: 119
Review time: 6-12 weeks
Journal description: The Annals of Statistics aim to publish research papers of highest quality reflecting the many facets of contemporary statistics. Primary emphasis is placed on importance and originality, not on formalism. The journal aims to cover all areas of statistics, especially mathematical statistics and applied & interdisciplinary statistics. Of course many of the best papers will touch on more than one of these general areas, because the discipline of statistics has deep roots in mathematics, and in substantive scientific fields.