A note on diffusion limits for stochastic gradient descent
Authors: Alberto Lanconelli, Christopher S.A. Lauria
Journal: Journal of Approximation Theory, volume 309, Article 106160
Publication date: 2025-02-27
Publication type: Journal Article
DOI: 10.1016/j.jat.2025.106160
URL: https://www.sciencedirect.com/science/article/pii/S0021904525000188
Citation count: 0
Abstract
In the machine learning literature, stochastic gradient descent has recently been widely discussed for its purported implicit regularization properties. Much of the theory that attempts to clarify the role of noise in stochastic gradient algorithms has approximated stochastic gradient descent by a stochastic differential equation with Gaussian noise. We provide a rigorous theoretical justification for this practice that showcases how the Gaussianity of the noise arises naturally.
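For orientation, the approximation referred to in the abstract is usually written in the following form in the machine learning literature. This is a sketch of that commonly used heuristic, not the precise statement proved in the paper; the symbols $L$, $\ell_i$, $\eta$, $\Sigma$, and $W_t$ are notation assumed here and do not appear in the abstract.

```latex
% Assumed setting: empirical risk L(\theta) = \frac{1}{N}\sum_{i=1}^{N} \ell_i(\theta),
% step size \eta > 0, index i_k drawn uniformly from \{1,\dots,N\} at each step.

% Stochastic gradient descent, split into mean gradient plus zero-mean noise:
\theta_{k+1} = \theta_k - \eta \nabla \ell_{i_k}(\theta_k)
             = \theta_k - \eta \nabla L(\theta_k)
               - \eta \bigl( \nabla \ell_{i_k}(\theta_k) - \nabla L(\theta_k) \bigr).

% Covariance of the gradient noise at a point \theta:
\Sigma(\theta) = \frac{1}{N} \sum_{i=1}^{N}
  \bigl( \nabla \ell_i(\theta) - \nabla L(\theta) \bigr)
  \bigl( \nabla \ell_i(\theta) - \nabla L(\theta) \bigr)^{\top}.

% Diffusion approximation on the time scale t = k\eta, with W_t a standard Brownian motion:
d\Theta_t = - \nabla L(\Theta_t)\, dt + \sqrt{\eta}\, \Sigma(\Theta_t)^{1/2}\, dW_t .
```

In this form the Gaussian noise enters through the Brownian increment $dW_t$, whereas the per-step noise $\nabla \ell_{i_k} - \nabla L$ in the discrete recursion is generally not Gaussian. That replacement is the practice the abstract says the paper justifies.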
About the journal:
The Journal of Approximation Theory is devoted to advances in pure and applied approximation theory and related areas. These areas include, among others:
• Classical approximation
• Abstract approximation
• Constructive approximation
• Degree of approximation
• Fourier expansions
• Interpolation of operators
• General orthogonal systems
• Interpolation and quadratures
• Multivariate approximation
• Orthogonal polynomials
• Padé approximation
• Rational approximation
• Spline functions of one and several variables
• Approximation by radial basis functions in Euclidean spaces, on spheres, and on more general manifolds
• Special functions with strong connections to classical harmonic analysis, orthogonal polynomial, and approximation theory (as opposed to combinatorics, number theory, representation theory, generating functions, formal theory, and so forth)
• Approximation theoretic aspects of real or complex function theory, function theory, difference or differential equations, function spaces, or harmonic analysis
• Wavelet Theory and its applications in signal and image processing, and in differential equations with special emphasis on connections between wavelet theory and elements of approximation theory (such as approximation orders, Besov and Sobolev spaces, and so forth)
• Gabor (Weyl-Heisenberg) expansions and sampling theory.