Approximating positive homogeneous functions with scale invariant neural networks
Stefan Bamberger, Reinhard Heckel, Felix Krahmer
Journal of Approximation Theory, vol. 311, Article 106177 (2025). DOI: 10.1016/j.jat.2025.106177
Abstract
We investigate the approximation of positive homogeneous functions, i.e., functions f satisfying f(λx) = λf(x) for all λ ≥ 0, with neural networks. Extending previous work, we establish new results explaining under which conditions such functions can be approximated with neural networks. As a key application, we analyze to what extent it is possible to solve linear inverse problems with ReLU networks. Due to the scaling invariance arising from the linearity, an optimal reconstruction function for such a problem is positive homogeneous. In a ReLU network, this condition translates to considering networks without bias terms. For the recovery of sparse vectors from few linear measurements, our results imply that ReLU networks with two hidden layers allow approximate recovery with arbitrary precision and arbitrary sparsity level s in a stable way. In contrast, we also show that such networks with only one hidden layer cannot recover even 1-sparse vectors, not even approximately, regardless of the width of the network. These findings also apply to a wider class of recovery problems, including low-rank matrix recovery and phase retrieval. Our results further shed some light on a seeming contradiction in previous work: neural networks for inverse problems typically have very large Lipschitz constants, yet still perform very well even under adversarial noise. Namely, the error bounds in our expressivity results combine a small constant term with a term that is linear in the noise level, indicating that robustness issues may occur only for very small noise levels.
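To illustrate the structural condition mentioned in the abstract (this sketch is not taken from the paper; the layer widths, random weights, and test scalings below are arbitrary choices for illustration), the following NumPy snippet builds a two-hidden-layer ReLU network without bias terms and checks numerically that it is positive homogeneous, i.e. that f(λx) = λf(x) for λ ≥ 0.

```python
# Minimal sketch: a bias-free ReLU network is positive homogeneous because
# ReLU(λz) = λ·ReLU(z) for λ >= 0 and linear layers commute with scaling.
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

# Random weight matrices; no bias vectors, so every layer preserves scaling.
W1 = rng.standard_normal((64, 10))   # input dimension 10, first hidden width 64
W2 = rng.standard_normal((64, 64))   # second hidden layer
W3 = rng.standard_normal((10, 64))   # output dimension 10

def f(x):
    """Bias-free ReLU network with two hidden layers."""
    return W3 @ relu(W2 @ relu(W1 @ x))

x = rng.standard_normal(10)
for lam in [0.0, 0.5, 2.0, 10.0]:
    # f(λx) and λ·f(x) agree (up to floating-point error) for every λ >= 0.
    print(lam, np.allclose(f(lam * x), lam * f(x)))
```

Adding any nonzero bias term would break this identity, which is why the paper's analysis of optimal (positive homogeneous) reconstruction maps restricts attention to networks without biases.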
About the journal:
The Journal of Approximation Theory is devoted to advances in pure and applied approximation theory and related areas. These areas include, among others:
• Classical approximation
• Abstract approximation
• Constructive approximation
• Degree of approximation
• Fourier expansions
• Interpolation of operators
• General orthogonal systems
• Interpolation and quadratures
• Multivariate approximation
• Orthogonal polynomials
• Padé approximation
• Rational approximation
• Spline functions of one and several variables
• Approximation by radial basis functions in Euclidean spaces, on spheres, and on more general manifolds
• Special functions with strong connections to classical harmonic analysis, orthogonal polynomials, and approximation theory (as opposed to combinatorics, number theory, representation theory, generating functions, formal theory, and so forth)
• Approximation theoretic aspects of real or complex function theory, difference or differential equations, function spaces, or harmonic analysis
• Wavelet Theory and its applications in signal and image processing, and in differential equations with special emphasis on connections between wavelet theory and elements of approximation theory (such as approximation orders, Besov and Sobolev spaces, and so forth)
• Gabor (Weyl-Heisenberg) expansions and sampling theory.