On the Improvement of the Barzilai–Borwein Step Size in Variance Reduction Methods
Authors: Hai Liu, Yan Liu, Tiande Guo, Congying Han
Journal: Applied Mathematics and Optimization, volume 92 2 (Journal Article; JCR Q2, Mathematics, Applied; impact factor 1.7)
Publication date: 2025-09-27
DOI: 10.1007/s00245-025-10316-9
URL: https://link.springer.com/article/10.1007/s00245-025-10316-9
Citations: 0
Abstract
We propose several modifications of the Barzilai–Borwein (BB) step size in variance reduction (VR) methods for finite-sum optimization problems. Our first approach relies on a scalar function, which we call the TaiL Function (TLF). The TLF maps the computed BB step size to a positive real number, which is then used as the step size instead. The computational overhead is almost negligible, and the functional forms of the TLFs in this work do not involve any problem-dependent parameters. In the strongly convex setting, because the condition number \(\kappa\) appears in the linear convergence rate, the incremental first-order oracle (IFO) complexity of VR methods with the BB step size has the form \(\mathcal{O}((n+\kappa^a)\kappa\log(1/\epsilon))\), \(a\in\mathbb{R}_{+}\). With the TLF, this complexity improves to \(\mathcal{O}((n+\kappa^{\tilde{a}})\log(1/\epsilon))\), where \(\tilde{a}\in\mathbb{R}_{+}\) and \(\tilde{a}<a\). In the non-convex setting, we improve the \(\mathcal{O}(n+n\epsilon^{-1})\) complexity of SVRG-SBB to \(\mathcal{O}(n+n^{\beta}\epsilon^{-1})\), where \(\beta\in\mathbb{R}_{+}\) can take any value in \((2/3, 1)\). In particular, the constant step size regime is recovered by taking the TLF to be a constant function whose value depends on problem-dependent parameters. As a counterpart to the constant step size regime, we also propose a BB-based vibration technique for setting step sizes in VR methods, leading to methods with novel one-parameter step sizes. These methods have the same complexities as their constant step size versions, while being empirically more robust with respect to the sole step size parameter. Moreover, a novel analysis is proposed for SARAH-I-type methods in the strongly convex setting. Numerical tests corroborate the proposed methods.
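To make the idea of passing the BB step size through a scalar function concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation. It assumes the commonly used SVRG-BB rule \(\eta_k = \frac{1}{m}\,\|s_k\|^2 / (s_k^\top y_k)\), with \(s_k\) the difference of consecutive snapshots and \(y_k\) the difference of their full gradients, and it applies a user-supplied function `tail` to that value. The abstract does not state the paper's actual TLF forms, so only two trivial choices are shown: the identity (plain BB step) and a constant function (the constant step size regime mentioned above). The ridge-regression objective, the inner-loop length `m = 2 * n`, and all function names are hypothetical.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def grad_i(x, a, b, lam):
    """Gradient of the toy component f_i(x) = 0.5*(a.x - b)^2 + 0.5*lam*|x|^2."""
    r = dot(a, x) - b
    return [r * aj + lam * xj for aj, xj in zip(a, x)]


def full_grad(x, A, y, lam):
    """Average of the component gradients over all n samples."""
    n = len(A)
    g = [0.0] * len(x)
    for a, b in zip(A, y):
        g = [gj + gij / n for gj, gij in zip(g, grad_i(x, a, b, lam))]
    return g


def svrg_bb(A, y, lam, tail, eta0=0.01, epochs=10, seed=0):
    """SVRG whose per-epoch step size is tail(BB step). Returns (x, step sizes)."""
    rng = random.Random(seed)
    n, d = len(A), len(A[0])
    m = 2 * n  # inner-loop length; an assumed, commonly used choice
    x_tilde = [0.0] * d
    prev_x = prev_g = None
    eta, etas = eta0, []
    for _ in range(epochs):
        g = full_grad(x_tilde, A, y, lam)
        if prev_x is not None:
            s = [p - q for p, q in zip(x_tilde, prev_x)]  # snapshot difference
            yv = [p - q for p, q in zip(g, prev_g)]       # gradient difference
            sy = dot(s, yv)
            if sy > 0:  # keep the previous step if the curvature estimate is unusable
                eta = tail(dot(s, s) / (m * sy))
        etas.append(eta)
        prev_x, prev_g = x_tilde, g
        x = list(x_tilde)
        for _ in range(m):
            i = rng.randrange(n)
            gi = grad_i(x, A[i], y[i], lam)
            gi_snap = grad_i(x_tilde, A[i], y[i], lam)
            # Variance-reduced gradient: grad_i(x) - grad_i(snapshot) + full gradient
            x = [xj - eta * (p - q + r) for xj, p, q, r in zip(x, gi, gi_snap, g)]
        x_tilde = x
    return x_tilde, etas


def identity_tail(eta_bb):
    """Hypothetical choice: use the BB step unchanged."""
    return eta_bb


def constant_tail(value):
    """Hypothetical choice: ignore the BB step, giving a constant step size."""
    return lambda eta_bb: value
```

Calling `svrg_bb(A, y, lam, identity_tail)` runs the unmodified BB variant, while `svrg_bb(A, y, lam, constant_tail(0.02))` falls back to a fixed step after the first epoch. Any other positive-valued scalar function can be passed as `tail` to experiment with the mapping.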
About the journal:
Applied Mathematics and Optimization covers a broad range of mathematical methods, in particular those that bridge with optimization and have some connection with applications. Core topics include calculus of variations, partial differential equations, stochastic control, optimization of deterministic or stochastic systems in discrete or continuous time, homogenization, control theory, mean field games, dynamic games, and optimal transport. Algorithmic, data analytic, machine learning, and numerical methods that support the modeling and analysis of optimization problems are encouraged. Of great interest are papers that present a novel idea in either theory or modeling and include some connection with potential applications in science and engineering.