{"title":"凸约束下的噪声线性逆问题:高维的精确风险渐近性","authors":"Qiyang Han","doi":"10.1214/23-aos2301","DOIUrl":null,"url":null,"abstract":"In the standard Gaussian linear measurement model Y=Xμ0+ξ∈Rm with a fixed noise level σ>0, we consider the problem of estimating the unknown signal μ0 under a convex constraint μ0∈K, where K is a closed convex set in Rn. We show that the risk of the natural convex constrained least squares estimator (LSE) μˆ(σ) can be characterized exactly in high-dimensional limits, by that of the convex constrained LSE μˆKseq in the corresponding Gaussian sequence model at a different noise level. Formally, we show that ‖μˆ(σ)−μ0‖2/(nrn2)→1in probability, where rn 2>0 solves the fixed-point equation E‖μˆKseq( (rn2+σ2)/(m/n))−μ0‖2=nrn2. This characterization holds (uniformly) for risks rn2 in the maximal regime that ranges from constant order all the way down to essentially the parametric rate, as long as certain necessary nondegeneracy condition is satisfied for μˆ(σ). The precise risk characterization reveals a fundamental difference between noiseless (or low noise limit) and noisy linear inverse problems in terms of the sample complexity for signal recovery. A concrete example is given by the isotonic regression problem: While exact recovery of a general monotone signal requires m≫n1/3 samples in the noiseless setting, consistent signal recovery in the noisy setting requires as few as m≫logn samples. Such a discrepancy occurs when the low and high noise risk behavior of μˆKseq differ significantly. In statistical languages, this occurs when μˆKseq estimates 0 at a faster “adaptation rate” than the slower “worst-case rate” for general signals. Several other examples, including nonnegative least squares and generalized Lasso (in constrained forms), are also worked out to demonstrate the concrete applicability of the theory in problems of different types. The proof relies on a collection of new analytic and probabilistic results concerning estimation error, log likelihood ratio test statistics and degree-of-freedom associated with μˆKseq, regarded as stochastic processes indexed by the noise level. These results are of independent interest in and of themselves.","PeriodicalId":8032,"journal":{"name":"Annals of Statistics","volume":null,"pages":null},"PeriodicalIF":3.2000,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Noisy linear inverse problems under convex constraints: Exact risk asymptotics in high dimensions\",\"authors\":\"Qiyang Han\",\"doi\":\"10.1214/23-aos2301\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In the standard Gaussian linear measurement model Y=Xμ0+ξ∈Rm with a fixed noise level σ>0, we consider the problem of estimating the unknown signal μ0 under a convex constraint μ0∈K, where K is a closed convex set in Rn. We show that the risk of the natural convex constrained least squares estimator (LSE) μˆ(σ) can be characterized exactly in high-dimensional limits, by that of the convex constrained LSE μˆKseq in the corresponding Gaussian sequence model at a different noise level. Formally, we show that ‖μˆ(σ)−μ0‖2/(nrn2)→1in probability, where rn 2>0 solves the fixed-point equation E‖μˆKseq( (rn2+σ2)/(m/n))−μ0‖2=nrn2. 
This characterization holds (uniformly) for risks rn2 in the maximal regime that ranges from constant order all the way down to essentially the parametric rate, as long as certain necessary nondegeneracy condition is satisfied for μˆ(σ). The precise risk characterization reveals a fundamental difference between noiseless (or low noise limit) and noisy linear inverse problems in terms of the sample complexity for signal recovery. A concrete example is given by the isotonic regression problem: While exact recovery of a general monotone signal requires m≫n1/3 samples in the noiseless setting, consistent signal recovery in the noisy setting requires as few as m≫logn samples. Such a discrepancy occurs when the low and high noise risk behavior of μˆKseq differ significantly. In statistical languages, this occurs when μˆKseq estimates 0 at a faster “adaptation rate” than the slower “worst-case rate” for general signals. Several other examples, including nonnegative least squares and generalized Lasso (in constrained forms), are also worked out to demonstrate the concrete applicability of the theory in problems of different types. The proof relies on a collection of new analytic and probabilistic results concerning estimation error, log likelihood ratio test statistics and degree-of-freedom associated with μˆKseq, regarded as stochastic processes indexed by the noise level. These results are of independent interest in and of themselves.\",\"PeriodicalId\":8032,\"journal\":{\"name\":\"Annals of Statistics\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":3.2000,\"publicationDate\":\"2023-08-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Annals of Statistics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1214/23-aos2301\",\"RegionNum\":1,\"RegionCategory\":\"数学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"STATISTICS & PROBABILITY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Annals of Statistics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1214/23-aos2301","RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"STATISTICS & PROBABILITY","Score":null,"Total":0}
Noisy linear inverse problems under convex constraints: Exact risk asymptotics in high dimensions
Qiyang Han · Annals of Statistics (2023) · doi:10.1214/23-aos2301
In the standard Gaussian linear measurement model $Y = X\mu_0 + \xi \in \mathbb{R}^m$ with a fixed noise level $\sigma > 0$, we consider the problem of estimating the unknown signal $\mu_0$ under a convex constraint $\mu_0 \in K$, where $K$ is a closed convex set in $\mathbb{R}^n$. We show that the risk of the natural convex constrained least squares estimator (LSE) $\hat{\mu}(\sigma)$ can be characterized exactly in high-dimensional limits by the risk of the convex constrained LSE $\hat{\mu}_K^{\mathrm{seq}}$ in the corresponding Gaussian sequence model at a different noise level. Formally, we show that $\|\hat{\mu}(\sigma) - \mu_0\|^2/(n r_n^2) \to 1$ in probability, where $r_n^2 > 0$ solves the fixed-point equation $\mathbb{E}\|\hat{\mu}_K^{\mathrm{seq}}(\sqrt{(r_n^2 + \sigma^2)/(m/n)}) - \mu_0\|^2 = n r_n^2$. This characterization holds (uniformly) for risks $r_n^2$ in the maximal regime that ranges from constant order all the way down to essentially the parametric rate, as long as a certain necessary nondegeneracy condition is satisfied for $\hat{\mu}(\sigma)$. The precise risk characterization reveals a fundamental difference between noiseless (or low-noise-limit) and noisy linear inverse problems in terms of the sample complexity for signal recovery. A concrete example is given by the isotonic regression problem: while exact recovery of a general monotone signal requires $m \gg n^{1/3}$ samples in the noiseless setting, consistent signal recovery in the noisy setting requires as few as $m \gg \log n$ samples. Such a discrepancy occurs when the low- and high-noise risk behaviors of $\hat{\mu}_K^{\mathrm{seq}}$ differ significantly; in statistical language, this occurs when $\hat{\mu}_K^{\mathrm{seq}}$ estimates $0$ at a faster "adaptation rate" than the slower "worst-case rate" for general signals. Several other examples, including nonnegative least squares and the generalized Lasso (in constrained form), are also worked out to demonstrate the concrete applicability of the theory to problems of different types. The proof relies on a collection of new analytic and probabilistic results concerning the estimation error, the log-likelihood-ratio test statistic, and the degrees of freedom associated with $\hat{\mu}_K^{\mathrm{seq}}$, regarded as stochastic processes indexed by the noise level. These results are of independent interest in their own right.
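To make the fixed-point characterization concrete, the following is a minimal numerical sketch (not from the paper) for the special case $K = \mathbb{R}^n_+$, i.e. nonnegative least squares, where the sequence-model LSE at noise level $\tau$ is the coordinatewise projection $\max(\mu_0 + \tau g, 0)$ with $g$ standard Gaussian. The Monte Carlo risk estimate, the plain fixed-point iteration, and the names `seq_risk` and `solve_fixed_point` are all illustrative assumptions rather than the author's construction.

```python
# A minimal sketch of the fixed-point equation
#   E || mu_seq( sqrt((r^2 + sigma^2)/(m/n)) ) - mu_0 ||^2 = n r^2
# for the special case K = R^n_+, where the sequence-model LSE is the
# coordinatewise positive-part projection. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)

def seq_risk(mu0, tau, g):
    """Monte Carlo estimate of E||Pi_K(mu0 + tau*g) - mu0||^2 for K = R^n_+."""
    fit = np.maximum(mu0 + tau * g, 0.0)  # projection onto the nonnegative orthant
    return np.mean(np.sum((fit - mu0) ** 2, axis=1))

def solve_fixed_point(mu0, sigma, m, n_mc=4000, n_iter=100, tol=1e-10):
    """Iterate r2 <- seq_risk(mu0, sqrt((r2 + sigma^2)/(m/n)), g)/n to a fixed point."""
    n = mu0.size
    g = rng.standard_normal((n_mc, n))  # one Gaussian sample shared across iterations
    r2 = sigma ** 2                     # initial guess
    for _ in range(n_iter):
        tau = np.sqrt((r2 + sigma ** 2) / (m / n))
        r2_new = seq_risk(mu0, tau, g) / n
        if abs(r2_new - r2) < tol:
            break
        r2 = r2_new
    return r2

n, m, sigma = 200, 400, 1.0
print(solve_fixed_point(np.zeros(n), sigma, m))  # approx 1/3; see note below
```

Fixing a single Gaussian sample across iterations makes the iterated map deterministic, so the simple iteration settles. At $\mu_0 = 0$ the per-coordinate risk of the orthant projection is $\tau^2/2$, which gives the closed-form fixed point $r^2 = \sigma^2 n/(2m - n)$ (equal to $1/3$ for $n = 200$, $m = 400$, $\sigma = 1$), a sanity check for the output above.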
Journal Introduction:
The Annals of Statistics aims to publish research papers of the highest quality reflecting the many facets of contemporary statistics. Primary emphasis is placed on importance and originality, not on formalism. The journal aims to cover all areas of statistics, especially mathematical statistics and applied & interdisciplinary statistics. Of course, many of the best papers will touch on more than one of these general areas, because the discipline of statistics has deep roots in mathematics and in substantive scientific fields.