Revisiting the central limit theorems for the SGD-type methods
Tiejun Li, Tiannan Xiao, Guoguo Yang
Communications in Mathematical Sciences, published 2024-07-15
DOI: https://doi.org/10.4310/cms.2024.v22.n5.a10
Impact factor: 1.2; JCR: Q2 (Mathematics, Applied)
Citation count: 0
Abstract
We revisited the central limit theorem (CLT) for stochastic gradient descent (SGD) type methods, including the vanilla SGD, momentum SGD and Nesterov accelerated SGD methods with constant or vanishing damping parameters. By taking advantage of the Lyapunov function technique and $L^p$ bound estimates, we established the CLT under more general conditions on learning rates and for broader classes of SGD methods than previous results. The CLT for the time average was also investigated, and we found that it held in the linear case, while it was not generally true in the nonlinear situation. Numerical tests were also carried out to verify our theoretical analysis.
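As a rough illustration of the kind of statement a CLT for SGD makes, the sketch below runs vanilla SGD on a one-dimensional quadratic objective and looks at the rescaled iterate $x_k/\sqrt{\eta}$. It is not the paper's setting or its numerical test: the objective $f(x) = a x^2/2$, the constant learning rate `eta`, the Rademacher (±1) gradient noise, and all function names are assumptions chosen for simplicity. For this linear recursion the stationary variance of $x_k$ is $\eta\sigma^2 / (a(2 - \eta a))$, so the rescaled iterate should have variance close to $\sigma^2/(2a)$ for small $\eta$, and its distribution should look nearly Gaussian even though the noise is discrete.

```python
import numpy as np


def sgd_quadratic(eta, a=1.0, sigma=1.0, n_steps=500, n_runs=20000, seed=0):
    """Run n_runs independent vanilla SGD chains on f(x) = a * x**2 / 2.

    The stochastic gradient is a * x + sigma * xi, where xi is Rademacher
    (+1 or -1 with equal probability). This noise model is an assumption
    made for illustration only.
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(n_runs)
    for _ in range(n_steps):
        xi = rng.choice([-1.0, 1.0], size=n_runs)
        x = x - eta * (a * x + sigma * xi)
    return x


def stationary_variance(eta, a=1.0, sigma=1.0):
    """Exact stationary variance of the linear recursion above."""
    return eta * sigma**2 / (a * (2.0 - eta * a))


eta = 0.05
x = sgd_quadratic(eta)

# Rescale by sqrt(eta): this is the scale on which the fluctuations are O(1).
z = x / np.sqrt(eta)

empirical_var = z.var()
predicted_var = stationary_variance(eta) / eta

# Excess kurtosis is 0 for a Gaussian and -2 for the raw Rademacher noise.
excess_kurtosis = np.mean((z - z.mean()) ** 4) / z.var() ** 2 - 3.0

print(f"empirical variance of x/sqrt(eta): {empirical_var:.4f}")
print(f"predicted variance:                {predicted_var:.4f}")
print(f"excess kurtosis:                   {excess_kurtosis:.4f}")
```

An excess kurtosis near zero, compared with -2 for the noise itself, is a simple indication that the averaged fluctuations are close to Gaussian. Decreasing `eta` should move both the variance and the kurtosis closer to their Gaussian limits.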
About the journal:
Covers modern applied mathematics in the fields of modeling, applied and stochastic analyses and numerical computations—on problems that arise in physical, biological, engineering, and financial applications. The journal publishes high-quality, original research articles, reviews, and expository papers.