Numerical Analysis for Convergence of a Sample-Wise Backpropagation Method for Training Stochastic Neural Networks

Impact Factor 2.8 · CAS Zone 2 (Mathematics) · JCR Q1 (Mathematics, Applied)
Richard Archibald, Feng Bao, Yanzhao Cao, Hui Sun
{"title":"Numerical Analysis for Convergence of a Sample-Wise Backpropagation Method for Training Stochastic Neural Networks","authors":"Richard Archibald, Feng Bao, Yanzhao Cao, Hui Sun","doi":"10.1137/22m1523765","DOIUrl":null,"url":null,"abstract":"SIAM Journal on Numerical Analysis, Volume 62, Issue 2, Page 593-621, April 2024. <br/> Abstract. The aim of this paper is to carry out convergence analysis and algorithm implementation of a novel sample-wise backpropagation method for training a class of stochastic neural networks (SNNs). The preliminary discussion on such an SNN framework was first introduced in [Archibald et al., Discrete Contin. Dyn. Syst. Ser. S, 15 (2022), pp. 2807–2835]. The structure of the SNN is formulated as a discretization of a stochastic differential equation (SDE). A stochastic optimal control framework is introduced to model the training procedure, and a sample-wise approximation scheme for the adjoint backward SDE is applied to improve the efficiency of the stochastic optimal control solver, which is equivalent to the backpropagation for training the SNN. The convergence analysis is derived by introducing a novel joint conditional expectation for the gradient process. Under the convexity assumption, our result indicates that the number of SNN training steps should be proportional to the square of the number of layers in the convex optimization case. 
In the implementation of the sample-based SNN algorithm with the benchmark MNIST dataset, we adopt the convolution neural network (CNN) architecture and demonstrate that our sample-based SNN algorithm is more robust than the conventional CNN.","PeriodicalId":49527,"journal":{"name":"SIAM Journal on Numerical Analysis","volume":"30 1","pages":""},"PeriodicalIF":2.8000,"publicationDate":"2024-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"SIAM Journal on Numerical Analysis","FirstCategoryId":"100","ListUrlMain":"https://doi.org/10.1137/22m1523765","RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"MATHEMATICS, APPLIED","Score":null,"Total":0}
引用次数: 0

Abstract

SIAM Journal on Numerical Analysis, Volume 62, Issue 2, Page 593-621, April 2024.
The aim of this paper is to carry out convergence analysis and algorithm implementation of a novel sample-wise backpropagation method for training a class of stochastic neural networks (SNNs). A preliminary discussion of this SNN framework first appeared in [Archibald et al., Discrete Contin. Dyn. Syst. Ser. S, 15 (2022), pp. 2807–2835]. The structure of the SNN is formulated as a discretization of a stochastic differential equation (SDE). A stochastic optimal control framework is introduced to model the training procedure, and a sample-wise approximation scheme for the adjoint backward SDE is applied to improve the efficiency of the stochastic optimal control solver, which is equivalent to backpropagation for training the SNN. The convergence analysis is derived by introducing a novel joint conditional expectation for the gradient process. Under the convexity assumption, our result indicates that the number of SNN training steps should be proportional to the square of the number of layers in the convex optimization case. In the implementation of the sample-based SNN algorithm on the benchmark MNIST dataset, we adopt the convolutional neural network (CNN) architecture and demonstrate that our sample-based SNN algorithm is more robust than the conventional CNN.
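The abstract's formulation of an SNN as a discretization of an SDE can be illustrated with a minimal sketch: each layer is one Euler–Maruyama step of dX = f(X; θ) dt + σ dW. All names here (the tanh drift, `sigma`, `dt`, the layer count) are illustrative assumptions, not the paper's actual architecture or parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def snn_forward(x, weights, sigma=0.1, dt=0.1):
    """Propagate x through len(weights) stochastic layers.

    Each layer applies one Euler-Maruyama step of the SDE
        dX_t = f(X_t; theta) dt + sigma dW_t,
    with a tanh drift standing in for the learned dynamics f.
    """
    for W in weights:
        drift = np.tanh(W @ x)                                   # drift term f(X; theta)
        noise = sigma * np.sqrt(dt) * rng.standard_normal(x.shape)  # sigma dW increment
        x = x + dt * drift + noise                               # one Euler-Maruyama step
    return x

n_layers, dim = 4, 3
weights = [rng.standard_normal((dim, dim)) for _ in range(n_layers)]
out = snn_forward(np.ones(dim), weights)
print(out.shape)  # (3,)
```

In this picture, training amounts to choosing the layer parameters θ to steer the terminal state X_T toward a target, which is why the paper can cast backpropagation as a stochastic optimal control problem with an adjoint backward SDE.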
Source journal

CiteScore: 4.80
Self-citation rate: 6.90%
Articles per year: 110
Review time: 4-8 weeks
Journal description: SIAM Journal on Numerical Analysis (SINUM) contains research articles on the development and analysis of numerical methods. Topics include the rigorous study of convergence of algorithms, their accuracy, their stability, and their computational complexity. Also included are results in mathematical analysis that contribute to algorithm analysis, and computational results that demonstrate algorithm behavior and applicability.