Convergence Rate Analysis of a Stochastic Trust-Region Method via Supermartingales

INFORMS Journal on Optimization | Pub Date: 2016-09-23 | DOI: 10.1287/IJOO.2019.0016

J. Blanchet, C. Cartis, M. Menickelly, K. Scheinberg


Citations: 91

Abstract



We propose a novel framework for analyzing convergence rates of stochastic optimization algorithms with adaptive step sizes. The framework analyzes properties of an underlying generic stochastic process, in particular by deriving a bound on the expected stopping time of this process. We use it to bound the expected global convergence rate of a stochastic variant of a traditional trust-region method, introduced in \cite{ChenMenickellyScheinberg2014}. While traditional trust-region methods rely on exact computations of the gradient, the Hessian, and the values of the objective function, this method assumes that these quantities are available only up to some dynamically adjusted accuracy. Moreover, this accuracy is assumed to hold only with some sufficiently large, but fixed, probability, without any additional restrictions on the variance of the errors. This setting applies, for example, to standard stochastic optimization and machine learning formulations. Improving upon the analysis in \cite{ChenMenickellyScheinberg2014}, we show that the stochastic process defined by the algorithm satisfies the assumptions of our proposed general framework, with the stopping time defined as the first iteration at which $\|\nabla f(x)\|\leq \epsilon$. The resulting bound on the expected stopping time is $O(\epsilon^{-2})$, under the assumption of sufficiently accurate stochastic gradients, and is the first global complexity bound for a stochastic trust-region method. Finally, we apply the same framework to derive a second-order complexity bound under some additional assumptions.
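The setting described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the toy objective $f(x) = \tfrac{1}{2}\|x\|^2$, the linear model, the noise model (errors bounded by a multiple of the trust-region radius with probability `p`, and 100 times larger otherwise), and all parameter names (`kappa`, `eta1`, `eta2`, `gamma`) are assumptions made for illustration. The loop returns the stopping time, that is, the first iteration at which the true gradient norm is at most `eps`; the true gradient is used only to detect that event, since a real stochastic method cannot observe it.

```python
import math
import random


def stochastic_trust_region(x0, eps=1e-3, delta0=1.0, delta_max=10.0,
                            gamma=2.0, eta1=0.1, eta2=1.0,
                            kappa=0.1, p=0.9, max_iter=10_000, seed=0):
    """Toy first-order stochastic trust-region loop on f(x) = 0.5 * ||x||^2.

    Returns (k, grad_norm): the first iteration k at which the TRUE gradient
    norm is <= eps (or max_iter if that never happens), and that norm.
    All parameter names and defaults are hypothetical.
    """
    rng = random.Random(seed)
    x = list(x0)
    delta = delta0  # trust-region radius, playing the role of the step size

    def f(z):
        return 0.5 * sum(zi * zi for zi in z)

    def norm(z):
        return math.sqrt(sum(zi * zi for zi in z))

    def noise(scale):
        # Assumed noise model: with probability p the error is bounded by
        # `scale`; otherwise the estimate is "inaccurate" (100x larger bound).
        bound = scale if rng.random() < p else 100.0 * scale
        return rng.uniform(-bound, bound)

    for k in range(max_iter):
        # Stopping-time check; for this objective, grad f(x) = x.
        if norm(x) <= eps:
            return k, norm(x)

        # Gradient estimate whose accuracy is tied to the radius delta.
        g = [xi + noise(kappa * delta) for xi in x]
        g_norm = norm(g)
        predicted = delta * g_norm  # decrease predicted by the linear model
        if predicted == 0.0:        # degenerate estimate: shrink and retry
            delta /= gamma
            continue

        # Minimizer of the linear model over the ball of radius delta.
        s = [-delta * gi / g_norm for gi in g]
        x_trial = [xi + si for xi, si in zip(x, s)]

        # Function-value estimates whose accuracy is tied to delta**2.
        f_old = f(x) + noise(kappa * delta ** 2)
        f_new = f(x_trial) + noise(kappa * delta ** 2)
        rho = (f_old - f_new) / predicted

        if rho >= eta1 and g_norm >= eta2 * delta:
            x = x_trial                             # successful step
            delta = min(gamma * delta, delta_max)   # expand the radius
        else:
            delta /= gamma                          # unsuccessful: shrink

    return max_iter, norm(x)
```

Calling `stochastic_trust_region([3.0, 4.0], kappa=0.0)` switches the noise off and reduces the loop to a deterministic trust-region iteration; with the default `kappa=0.1` the estimates are noisy, and the returned `k` is one sample of the stopping time whose expectation the paper bounds.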

