Adaptivity of Stochastic Gradient Methods for Nonconvex Optimization

IF 1.9 Q1 MATHEMATICS, APPLIED
Samuel Horváth, Lihua Lei, Peter Richtárik, Michael I. Jordan
{"title":"随机梯度法在非凸优化中的自适应性","authors":"Samuel Horváth, Lihua Lei, Peter Richtárik, Michael I. Jordan","doi":"10.1137/21m1394308","DOIUrl":null,"url":null,"abstract":"Adaptivity is an important yet under-studied property in modern optimization theory. The gap between the state-of-the-art theory and the current practice is striking in that algorithms with desirable theoretical guarantees typically involve drastically different settings of hyperparameters, such as step-size schemes and batch sizes, in different regimes. Despite the appealing theoretical results, such divisive strategies provide little, if any, insight to practitioners to select algorithms that work broadly without tweaking the hyperparameters. In this work, blending the \"geometrization\" technique introduced by Lei & Jordan 2016 and the \\texttt{SARAH} algorithm of Nguyen et al., 2017, we propose the Geometrized \\texttt{SARAH} algorithm for non-convex finite-sum and stochastic optimization. Our algorithm is proved to achieve adaptivity to both the magnitude of the target accuracy and the Polyak-Łojasiewicz (PL) constant if present. In addition, it achieves the best-available convergence rate for non-PL objectives simultaneously while outperforming existing algorithms for PL objectives.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"236 1","pages":"634-648"},"PeriodicalIF":1.9000,"publicationDate":"2020-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"18","resultStr":"{\"title\":\"Adaptivity of Stochastic Gradient Methods for Nonconvex Optimization\",\"authors\":\"Samuel Horváth, Lihua Lei, Peter Richtárik, Michael I. Jordan\",\"doi\":\"10.1137/21m1394308\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Adaptivity is an important yet under-studied property in modern optimization theory. The gap between the state-of-the-art theory and the current practice is striking in that algorithms with desirable theoretical guarantees typically involve drastically different settings of hyperparameters, such as step-size schemes and batch sizes, in different regimes. Despite the appealing theoretical results, such divisive strategies provide little, if any, insight to practitioners to select algorithms that work broadly without tweaking the hyperparameters. In this work, blending the \\\"geometrization\\\" technique introduced by Lei & Jordan 2016 and the \\\\texttt{SARAH} algorithm of Nguyen et al., 2017, we propose the Geometrized \\\\texttt{SARAH} algorithm for non-convex finite-sum and stochastic optimization. Our algorithm is proved to achieve adaptivity to both the magnitude of the target accuracy and the Polyak-Łojasiewicz (PL) constant if present. 
In addition, it achieves the best-available convergence rate for non-PL objectives simultaneously while outperforming existing algorithms for PL objectives.\",\"PeriodicalId\":74797,\"journal\":{\"name\":\"SIAM journal on mathematics of data science\",\"volume\":\"236 1\",\"pages\":\"634-648\"},\"PeriodicalIF\":1.9000,\"publicationDate\":\"2020-02-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"18\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"SIAM journal on mathematics of data science\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1137/21m1394308\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"MATHEMATICS, APPLIED\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"SIAM journal on mathematics of data science","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1137/21m1394308","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"MATHEMATICS, APPLIED","Score":null,"Total":0}
Citations: 18

Abstract

Adaptivity is an important yet under-studied property in modern optimization theory. The gap between state-of-the-art theory and current practice is striking: algorithms with desirable theoretical guarantees typically require drastically different hyperparameter settings, such as step-size schemes and batch sizes, in different regimes. Despite the appealing theoretical results, such divisive strategies give practitioners little, if any, guidance for selecting algorithms that work broadly without hyperparameter tweaking. In this work, blending the "geometrization" technique introduced by Lei & Jordan (2016) with the SARAH algorithm of Nguyen et al. (2017), we propose the Geometrized SARAH algorithm for nonconvex finite-sum and stochastic optimization. Our algorithm is proved to be adaptive both to the magnitude of the target accuracy and to the Polyak-Łojasiewicz (PL) constant, if present. In addition, it simultaneously achieves the best-available convergence rate for non-PL objectives and outperforms existing algorithms for PL objectives.
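As background for readers (a minimal, illustrative sketch, not the authors' Geometrized variant): the SARAH estimator of Nguyen et al. (2017) replaces the plain stochastic gradient with a recursively updated estimate that is re-anchored by one full gradient at the start of each outer loop. The helpers `grad_full` and `grad_i` and the toy least-squares demo below are assumptions for illustration; the paper's geometrization, which randomizes the effective inner-loop length via a geometric distribution following Lei & Jordan, is not reproduced here.

```python
import numpy as np

def sarah(grad_full, grad_i, w0, n, eta=0.05, outer_iters=20, inner_len=50, seed=0):
    """Minimal sketch of plain SARAH (Nguyen et al., 2017).

    grad_full(w):  full gradient of f at w          (hypothetical helper)
    grad_i(i, w):  gradient of component f_i at w   (hypothetical helper)
    Geometrized SARAH (this paper) additionally randomizes the inner-loop
    length; a fixed length is used here for simplicity.
    """
    rng = np.random.default_rng(seed)
    w = np.asarray(w0, dtype=float).copy()
    for _ in range(outer_iters):
        # Re-anchor the recursive estimator with one full gradient.
        v = grad_full(w)
        w_prev, w = w, w - eta * v
        for _ in range(inner_len):
            i = rng.integers(n)
            # SARAH recursion: v_t = grad_i(w_t) - grad_i(w_{t-1}) + v_{t-1}.
            v = grad_i(i, w) - grad_i(i, w_prev) + v
            w_prev, w = w, w - eta * v
    return w

if __name__ == "__main__":
    # Toy least-squares demo: f_i(w) = 0.5 * (A[i] @ w - b[i])**2.
    rng = np.random.default_rng(1)
    A, b = rng.normal(size=(100, 5)), rng.normal(size=100)
    w_hat = sarah(
        grad_full=lambda w: A.T @ (A @ w - b) / len(b),
        grad_i=lambda i, w: A[i] * (A[i] @ w - b[i]),
        w0=np.zeros(5), n=100,
    )
    print("final gradient norm:", np.linalg.norm(A.T @ (A @ w_hat - b) / len(b)))
```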
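For reference (not part of the paper's abstract), the PL condition invoked above has a standard statement: a differentiable function $f$ with minimum value $f^*$ satisfies the Polyak-Łojasiewicz condition with constant $\mu > 0$ if

\[
  \tfrac{1}{2}\,\lVert \nabla f(x) \rVert^2 \;\ge\; \mu\,\bigl(f(x) - f^*\bigr)
  \qquad \text{for all } x .
\]

The condition is implied by strong convexity but also holds for some nonconvex functions, and gradient descent converges linearly under it; this is why adapting to an unknown PL constant, rather than requiring $\mu$ as an input, is valuable.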