Approximating positive homogeneous functions with scale invariant neural networks
Stefan Bamberger, Reinhard Heckel, Felix Krahmer
Journal of Approximation Theory, vol. 311, Article 106177 (2025). DOI: 10.1016/j.jat.2025.106177
Abstract
We investigate the approximation of positive homogeneous functions, i.e., functions f satisfying f(λx) = λf(x) for all λ ≥ 0, with neural networks. Extending previous work, we establish new results explaining under which conditions such functions can be approximated with neural networks. As a key application, we analyze to what extent it is possible to solve linear inverse problems with ReLU networks. Due to the scaling invariance arising from the linearity, an optimal reconstruction function for such a problem is positive homogeneous. In a ReLU network, this condition translates to considering networks without bias terms. For the recovery of sparse vectors from few linear measurements, our results imply that ReLU networks with two hidden layers allow approximate recovery with arbitrary precision and arbitrary sparsity level s in a stable way. In contrast, we also show that such networks with only one hidden layer cannot recover even 1-sparse vectors, not even approximately, regardless of the width of the network. These findings also apply to a wider class of recovery problems, including low-rank matrix recovery and phase retrieval. Our results further shed some light on a seeming contradiction in previous work: neural networks for inverse problems typically have very large Lipschitz constants, yet still perform very well even under adversarial noise. Namely, the error bounds in our expressivity results combine a small constant term with a term that is linear in the noise level, indicating that robustness issues may occur only for very small noise levels.
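To illustrate the structural condition mentioned in the abstract (this sketch is not taken from the paper; the layer widths, random weights, and test scalings below are arbitrary choices for illustration), the following NumPy snippet builds a two-hidden-layer ReLU network without bias terms and checks numerically that it is positive homogeneous, i.e. that f(λx) = λf(x) for λ ≥ 0.

```python
# Minimal sketch: a bias-free ReLU network is positive homogeneous because
# ReLU(λz) = λ·ReLU(z) for λ >= 0 and linear layers commute with scaling.
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

# Random weight matrices; no bias vectors, so every layer preserves scaling.
W1 = rng.standard_normal((64, 10))   # input dimension 10, first hidden width 64
W2 = rng.standard_normal((64, 64))   # second hidden layer
W3 = rng.standard_normal((10, 64))   # output dimension 10

def f(x):
    """Bias-free ReLU network with two hidden layers."""
    return W3 @ relu(W2 @ relu(W1 @ x))

x = rng.standard_normal(10)
for lam in [0.0, 0.5, 2.0, 10.0]:
    # f(λx) and λ·f(x) agree (up to floating-point error) for every λ >= 0.
    print(lam, np.allclose(f(lam * x), lam * f(x)))
```

Adding any nonzero bias term would break this identity, which is why the paper's analysis of optimal (positive homogeneous) reconstruction maps restricts attention to networks without biases.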
About the journal:
The Journal of Approximation Theory is devoted to advances in pure and applied approximation theory and related areas. These areas include, among others:
• Classical approximation
• Abstract approximation
• Constructive approximation
• Degree of approximation
• Fourier expansions
• Interpolation of operators
• General orthogonal systems
• Interpolation and quadratures
• Multivariate approximation
• Orthogonal polynomials
• Padé approximation
• Rational approximation
• Spline functions of one and several variables
• Approximation by radial basis functions in Euclidean spaces, on spheres, and on more general manifolds
• Special functions with strong connections to classical harmonic analysis, orthogonal polynomials, and approximation theory (as opposed to combinatorics, number theory, representation theory, generating functions, formal theory, and so forth)
• Approximation theoretic aspects of real or complex function theory, difference or differential equations, function spaces, or harmonic analysis
• Wavelet Theory and its applications in signal and image processing, and in differential equations with special emphasis on connections between wavelet theory and elements of approximation theory (such as approximation orders, Besov and Sobolev spaces, and so forth)
• Gabor (Weyl-Heisenberg) expansions and sampling theory.