On the Improvement of the Barzilai–Borwein Step Size in Variance Reduction Methods

Impact factor 1.7 · Region 2 (Mathematics) · JCR Q2 (MATHEMATICS, APPLIED)
Hai Liu, Yan Liu, Tiande Guo, Congying Han
Journal: Applied Mathematics and Optimization, 92(2)
Published: 2025-09-27
DOI: 10.1007/s00245-025-10316-9
URL: https://link.springer.com/article/10.1007/s00245-025-10316-9
Citations: 0

Abstract


We propose several modifications of the Barzilai–Borwein (BB) step size in variance reduction (VR) methods for finite-sum optimization problems. Our first approach relies on a scalar function, which we call the TaiL Function (TLF). The TLF maps the computed BB step size to a positive real number, which is then used as the step size instead. The computational overhead is almost negligible, and the functional forms of the TLFs in this work do not involve any problem-dependent parameters. In the strongly convex setting, because the condition number \(\kappa \) appears in the linear convergence rate, the incremental first-order oracle (IFO) complexity of VR methods with the BB step size has the form \(\mathcal {O}((n+\kappa ^a)\kappa \log (1/\epsilon ))\), \(a\in \mathbb {R}_{+}\). With the TLF, this complexity improves to \(\mathcal {O}((n+\kappa ^{\tilde{a}})\log (1/\epsilon ))\), \(\tilde{a}\in \mathbb {R}_{+}, \tilde{a}<a\). In the non-convex setting, we improve the \(\mathcal {O}(n+n\epsilon ^{-1})\) complexity of SVRG-SBB to \(\mathcal {O}(n+n^{\beta }\epsilon ^{-1})\), where \(\beta \in \mathbb {R}_{+}\) can take any value in (2/3, 1). In particular, the constant step size regime is recovered by taking the TLF to be a constant function whose value depends on problem-dependent parameters. As a counterpart to the constant step size regime, we also propose a BB-based vibration technique for setting step sizes in VR methods, leading to methods with novel one-parameter step sizes. These methods have the same complexities as their constant step size versions, while empirically being more robust with respect to the sole step size parameter. Moreover, a novel analysis is proposed for SARAH-I-type methods in the strongly convex setting. Numerical tests corroborate the proposed methods.
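
To make the idea of passing a BB step size through a scalar map concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' method. It assumes the commonly used SVRG-BB rule, where the step for an epoch is \(\|s\|^2/(m\, s^\top y)\) with \(s\) the difference of consecutive snapshots and \(y\) the difference of their full gradients. The abstract does not give the functional form of any TLF, so the `tail` argument is a hypothetical placeholder: the identity gives plain SVRG-BB, and the saturating map in the usage note is invented purely for illustration. The ridge-regression test problem, the names `make_ridge_problem` and `svrg_bb`, and all default parameters are also assumptions.

```python
import numpy as np


def make_ridge_problem(n=200, d=10, seed=0):
    """Synthetic least-squares data (illustrative only, not from the paper)."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, d))
    b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)
    return A, b


def svrg_bb(A, b, lam=0.1, epochs=30, eta0=0.005, tail=lambda eta: eta, seed=0):
    """SVRG with a BB step size passed through a scalar map `tail`.

    Minimizes (1/n) * sum_i [0.5 * (a_i.x - b_i)^2 + 0.5 * lam * ||x||^2].
    `tail` is a hypothetical stand-in for a TLF; identity means plain SVRG-BB.
    Returns the final snapshot and the norm of its full gradient.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    m = 2 * n  # inner-loop length (an assumed, common choice)

    def full_grad(x):
        return A.T @ (A @ x - b) / n + lam * x

    x_ref = np.zeros(d)
    prev_ref, prev_grad = None, None
    eta = eta0  # the first epoch has no BB information yet
    for _ in range(epochs):
        g_ref = full_grad(x_ref)
        if prev_ref is not None:
            s = x_ref - prev_ref   # change in snapshot
            y = g_ref - prev_grad  # change in full gradient
            sy = s @ y
            if sy > 0:             # keep the previous step if curvature is degenerate
                eta = tail((s @ s) / (m * sy))
        prev_ref, prev_grad = x_ref.copy(), g_ref

        x = x_ref.copy()
        for _ in range(m):
            a = A[rng.integers(n)]
            diff = x - x_ref
            # grad f_i(x) - grad f_i(x_ref) + full gradient at the snapshot
            v = a * (a @ diff) + lam * diff + g_ref
            x -= eta * v
        x_ref = x
    return x_ref, np.linalg.norm(full_grad(x_ref))
```

A possible usage, again with a made-up map: `svrg_bb(A, b, tail=lambda eta: eta / (1.0 + eta))` keeps every step positive and below 1. Making `tail` a plain callable keeps the BB computation separate from whatever map is applied to it, so different maps can be compared without touching the inner loop.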

Source journal
CiteScore: 3.30
Self-citation rate: 5.60%
Articles published: 103
Review time: >12 weeks
About the journal: The Applied Mathematics and Optimization Journal covers a broad range of mathematical methods, in particular those that bridge with optimization and have some connection with applications. Core topics include calculus of variations, partial differential equations, stochastic control, optimization of deterministic or stochastic systems in discrete or continuous time, homogenization, control theory, mean field games, dynamic games and optimal transport. Algorithmic, data analytic, machine learning and numerical methods which support the modeling and analysis of optimization problems are encouraged. Of great interest are papers which show some novel idea in either the theory or the model and which include some connection with potential applications in science and engineering.