A non-monotone trust-region method with noisy oracles and additional sampling

Impact factor: 1.6 | Zone 2 (Mathematics) | JCR Q2 (Mathematics, Applied)
Nataša Krejić, Nataša Krklec Jerinkić, Ángeles Martínez, Mahsa Yousefi
Journal: Computational Optimization and Applications
Published: 2024-05-31
DOI: 10.1007/s10589-024-00580-w (https://doi.org/10.1007/s10589-024-00580-w)
Citations: 0

Abstract


In this work, we introduce a novel stochastic second-order method, within the framework of a non-monotone trust-region approach, for solving the unconstrained, nonlinear, and non-convex optimization problems arising in the training of deep neural networks. The proposed algorithm makes use of subsampling strategies that yield noisy approximations of the finite sum objective function and its gradient. We introduce an adaptive sample size strategy based on inexpensive additional sampling to control the resulting approximation error. Depending on the estimated progress of the algorithm, this can yield sample size scenarios ranging from mini-batch to full sample functions. We provide convergence analysis for all possible scenarios and show that the proposed method achieves almost sure convergence under standard assumptions for the trust-region framework. We report numerical experiments showing that the proposed algorithm outperforms its state-of-the-art counterpart in deep neural network training for image classification and regression tasks while requiring a significantly smaller number of gradient evaluations.
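To make the ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' algorithm. It combines a subsampled objective and gradient, a non-monotone acceptance test, and an inexpensive independent check sample that triggers a larger sample size. Everything specific here is an assumption made for illustration: the synthetic least-squares problem, the names `make_data`, `loss_and_grad`, and `train`, the identity matrix standing in for a curvature approximation, the max-of-recent-values reference, and the batch-doubling rule.

```python
import random


def make_data(n, seed=0):
    """Synthetic 2-D least-squares data (hypothetical test problem)."""
    rng = random.Random(seed)
    w_true = (2.0, -1.0)
    data = []
    for _ in range(n):
        a = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        b = a[0] * w_true[0] + a[1] * w_true[1] + rng.gauss(0, 0.01)
        data.append((a, b))
    return data


def loss_and_grad(w, batch):
    """Noisy oracle: objective and gradient averaged over `batch` only."""
    f, g = 0.0, [0.0, 0.0]
    for a, b in batch:
        r = a[0] * w[0] + a[1] * w[1] - b
        f += 0.5 * r * r
        g[0] += r * a[0]
        g[1] += r * a[1]
    m = len(batch)
    return f / m, [g[0] / m, g[1] / m]


def train(data, iters=200, batch_size=8, radius=1.0, memory=5, eta=0.1, seed=1):
    """Illustrative non-monotone trust-region loop with an adaptive batch size."""
    rng = random.Random(seed)
    n = len(data)
    w = [0.0, 0.0]
    recent = []  # recent sampled objective values for the non-monotone test
    for _ in range(iters):
        batch = data if batch_size >= n else rng.sample(data, batch_size)
        f, g = loss_and_grad(w, batch)
        gnorm = (g[0] ** 2 + g[1] ** 2) ** 0.5
        if gnorm < 1e-10:
            break
        # Model m(s) = f + g.s + 0.5*||s||^2 (identity curvature, an assumption).
        # Its minimizer along -g inside the trust region has step length t.
        t = min(1.0, radius / gnorm)
        s = [-t * g[0], -t * g[1]]
        predicted = t * gnorm ** 2 * (1.0 - 0.5 * t)  # model decrease, > 0
        trial = [w[0] + s[0], w[1] + s[1]]
        f_trial, _ = loss_and_grad(trial, batch)
        # Non-monotone reference: the largest of the last `memory` sampled values.
        recent = (recent + [f])[-memory:]
        rho = (max(recent) - f_trial) / predicted
        # Additional sampling: a small independent sample re-checks the decrease.
        check = rng.sample(data, min(n, 4))
        f_check_old, _ = loss_and_grad(w, check)
        f_check_new, _ = loss_and_grad(trial, check)
        confirmed = f_check_new <= f_check_old
        if rho >= eta:
            w = trial
            radius = min(2.0 * radius, 10.0)
        else:
            radius *= 0.5
        # Assumed rule: grow the sample when the check sample disagrees.
        if not confirmed and batch_size < n:
            batch_size = min(n, 2 * batch_size)
    return w, batch_size
```

Calling `train(make_data(200))` returns the final iterate and the final batch size. In this toy setting the batch size may stay small or grow all the way to the full sample, which mirrors the range of scenarios the abstract mentions without reproducing the paper's actual update rules.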

Source journal
CiteScore: 3.70
Self-citation rate: 9.10%
Articles published: 91
Review time: 10 months
About the journal: Computational Optimization and Applications is a peer-reviewed journal that is committed to timely publication of research and tutorial papers on the analysis and development of computational algorithms and modeling technology for optimization. Algorithms either for general classes of optimization problems or for more specific applied problems are of interest. Stochastic algorithms as well as deterministic algorithms will be considered. Papers that provide both theoretical analysis and carefully designed computational experiments are particularly welcome. Topics of interest include, but are not limited to, the following: Large Scale Optimization, Unconstrained Optimization, Linear Programming, Quadratic Programming, Complementarity Problems, and Variational Inequalities, Constrained Optimization, Nondifferentiable Optimization, Integer Programming, Combinatorial Optimization, Stochastic Optimization, Multiobjective Optimization, Network Optimization, Complexity Theory, Approximations and Error Analysis, Parametric Programming and Sensitivity Analysis, Parallel Computing, Distributed Computing, and Vector Processing, Software, Benchmarks, Numerical Experimentation and Comparisons, Modelling Languages and Systems for Optimization, Automatic Differentiation, Applications in Engineering, Finance, Optimal Control, Optimal Design, Operations Research, Transportation, Economics, Communications, Manufacturing, and Management Science.