Smoothed Variable Sample-Size Accelerated Proximal Methods for Nonsmooth Stochastic Convex Programs

Q1 Mathematics
A. Jalilzadeh, U. Shanbhag, J. Blanchet, P. Glynn
Journal: Stochastic Systems
DOI: 10.1287/stsy.2022.0095 (https://doi.org/10.1287/stsy.2022.0095)
Publication date: 2018-03-02
Publication type: Journal Article
Citations: 8

Abstract

We consider the unconstrained minimization of the function F, where F = f + g, f is an expectation-valued nonsmooth convex or strongly convex function, and g is a closed, convex, and proper function.

(I) Strongly convex f. When f is μ-strongly convex in x, traditional stochastic subgradient (SSG) schemes often behave poorly, in part because of noisy subgradients and diminishing steplengths. Instead, we apply a variable sample-size accelerated proximal scheme (VS-APM) to F_η, the Moreau envelope of F; we term this scheme (mVS-APM). In contrast with (SSG) schemes, (mVS-APM) uses constant steplengths and increasingly exact gradients. We consider two settings. (a) Bounded domains. In this setting, (mVS-APM) displays linear convergence in inexact gradient steps, each of which requires an inner (prox-SSG) scheme. Specifically, (mVS-APM) achieves an optimal oracle complexity in prox-SSG steps of [Formula: see text] with an iteration complexity of [Formula: see text] in inexact (outer) gradients of F_η to achieve an ε-accurate solution in mean-squared error, computed via an increasing number of inner (stochastic) subgradient steps. (b) Unbounded domains. In this regime, under an assumption of state-dependent bounds on subgradients, an unaccelerated variant (mVS-APM) is linearly convergent, where the gradients ∇_x F_η(x) are approximated with increasing accuracy via (SSG) schemes. Notably, (mVS-APM) also displays an optimal oracle complexity of [Formula: see text].

(II) Convex f. When f is merely convex but smoothable, suitable choices of the smoothing, steplength, and batch-size sequences allow smoothed (VS-APM) (or sVS-APM) to achieve an optimal oracle complexity of [Formula: see text] to obtain an ε-optimal solution. Our results can be specialized to two important cases. (a) Smooth f. Since smoothing is no longer required, we observe that (VS-APM) admits the optimal rate and oracle complexity, matching prior findings. (b) Deterministic nonsmooth f. In the nonsmooth deterministic regime, (sVS-APM) reduces to a smoothed accelerated proximal method (s-APM) that is both asymptotically convergent and optimal in that it displays a complexity of [Formula: see text], matching the bound provided by Nesterov in 2005 for producing ε-optimal solutions.

Finally, (sVS-APM) and (VS-APM) produce sequences that converge almost surely to a solution of the original problem.
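
To make the idea in part (I) concrete, the following is a minimal sketch, not the authors' algorithm: an unaccelerated, inexact proximal-point iteration on the Moreau envelope, in which each outer step is approximated by a geometrically growing number of inner stochastic subgradient steps. Everything specific here is an assumption chosen for illustration: the toy objective F(u) = E|u - xi| + (MU/2) u^2 with xi uniform on (-1, 1) (whose minimizer is 0), g = 0, the parameter values MU and ETA, the batch schedule n0 * growth^k, and the function names. It relies on the standard identity x - ETA * grad F_eta(x) = prox_{ETA*F}(x), so a gradient step on the envelope with constant steplength ETA is a proximal step on F.

```python
import random

MU = 1.0    # strong-convexity modulus of the toy objective (assumed value)
ETA = 1.0   # Moreau parameter, also the constant outer steplength (assumed value)


def inexact_prox(x, num_steps, rng):
    """Approximate prox_{ETA*F}(x) with num_steps stochastic subgradient steps.

    Toy objective: F(u) = E|u - xi| + (MU/2) u^2, with xi ~ Uniform(-1, 1).
    The prox subproblem adds (1/(2*ETA)) (u - x)^2, which makes it
    (MU + 1/ETA)-strongly convex.
    """
    c = MU + 1.0 / ETA
    u = x  # warm start at the current outer iterate
    for t in range(num_steps):
        xi = rng.uniform(-1.0, 1.0)
        sign = 1.0 if u > xi else -1.0       # a subgradient of |u - xi|
        g = sign + MU * u + (u - x) / ETA    # noisy subgradient of the subproblem
        u -= g / (c * (t + 1))               # diminishing inner steplength
    return u


def moreau_prox_point(x0, outer_iters=10, n0=10, growth=2, seed=0):
    """Constant-steplength outer loop with increasingly exact inner solves."""
    rng = random.Random(seed)
    x, total = x0, 0
    for k in range(outer_iters):
        n_k = n0 * growth ** k          # inner sample size grows geometrically
        x = inexact_prox(x, n_k, rng)   # inexact gradient step on the envelope
        total += n_k
    return x, total


x_final, samples_used = moreau_prox_point(0.9)
print(f"final iterate: {x_final:.4f}, samples used: {samples_used}")
```

With the default schedule the run draws 10230 samples in total, and the final iterate should lie close to the minimizer 0. The outer loop keeps a constant steplength while only the inner accuracy increases, which is the contrast with plain (SSG) that the abstract draws; acceleration and the nonsmooth term g are omitted to keep the sketch short.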
Source journal: Stochastic Systems (Decision Sciences: Statistics, Probability and Uncertainty)
CiteScore: 3.70
Self-citation rate: 0.00%
Articles published: 18