A Variational Formulation of Accelerated Optimization on Riemannian Manifolds

Impact factor: 2.6 · JCR Q1 (Mathematics, Applied)

SIAM Journal on Mathematics of Data Science · Pub Date: 2021-01-16 · DOI: 10.1137/21m1395648

Valentin Duruisseaux, M. Leok

Pages: 649-674 · Publication type: Journal Article

Citations: 18

Abstract

It was shown recently by Su et al. (2016) that Nesterov's accelerated gradient method for minimizing a smooth convex function $f$ can be thought of as the time discretization of a second-order ODE, and that $f(x(t))$ converges to its optimal value at a rate of $\mathcal{O}(1/t^2)$ along any trajectory $x(t)$ of this ODE. A variational formulation was introduced in Wibisono et al. (2016) which allowed for accelerated convergence at a rate of $\mathcal{O}(1/t^p)$, for arbitrary $p>0$, in normed vector spaces. This framework was exploited in Duruisseaux et al. (2021) to design efficient explicit algorithms for symplectic accelerated optimization. In Alimisis et al. (2020), a second-order ODE was proposed as the continuous-time limit of a Riemannian accelerated algorithm, and it was shown that the objective function $f(x(t))$ converges to its optimal value at a rate of $\mathcal{O}(1/t^2)$ along solutions of this ODE. In this paper, we show that on Riemannian manifolds, the convergence rate of $f(x(t))$ to its optimal value can also be accelerated to an arbitrary convergence rate $\mathcal{O}(1/t^p)$, by considering a family of time-dependent Bregman Lagrangian and Hamiltonian systems on Riemannian manifolds. This generalizes the results of Wibisono et al. (2016) to Riemannian manifolds and also provides a variational framework for accelerated optimization on Riemannian manifolds. An approach based on the time-invariance property of the family of Bregman Lagrangians and Hamiltonians was used to construct very efficient optimization algorithms in Duruisseaux et al. (2021), and we establish a similar time-invariance property in the Riemannian setting. One expects that a geometric numerical integrator that is time-adaptive, symplectic, and Riemannian manifold preserving will yield a class of promising optimization algorithms on manifolds.
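The rates quoted above can be checked numerically in the flat (Euclidean) special case. The following is a minimal sketch, not the paper's Riemannian construction or its integrator. It assumes the polynomial parameter family of Wibisono et al. (2016) with $h(x) = \frac{1}{2}\|x\|^2$, for which the Euler-Lagrange equation reduces to $\ddot{x} + \frac{p+1}{t}\dot{x} + Cp^2t^{p-2}\nabla f(x) = 0$ (with $p=2$, $C=1/4$ this is the ODE of Su et al.). In that setting the quantity $\frac{1}{2}\|x + \frac{t}{p}\dot{x} - x^*\|^2 + Ct^p(f(x) - f^*)$ is non-increasing for convex $f$, which gives $f(x(t)) - f^* \le E(t_0)/(Ct^p)$. The quadratic objective, the start time $t_0 = 1$, the classical RK4 integrator (which is not symplectic), and all function names are illustrative assumptions.

```python
def f(x):
    # Hypothetical convex quadratic test objective; minimizer x* = 0, f(x*) = 0.
    return 0.5 * (x[0] ** 2 + 10.0 * x[1] ** 2)


def grad_f(x):
    return [x[0], 10.0 * x[1]]


def accel(t, x, v, p, C):
    # x'' = -(p+1)/t * x' - C p^2 t^(p-2) * grad f(x)   (Euclidean case only)
    g = grad_f(x)
    return [-(p + 1) / t * vi - C * p * p * t ** (p - 2) * gi
            for vi, gi in zip(v, g)]


def axpy(a, x, y):
    # Componentwise a*x + y.
    return [a * xi + yi for xi, yi in zip(x, y)]


def rk4_combine(y, k1, k2, k3, k4, dt):
    # Classical RK4 update y + dt/6 * (k1 + 2 k2 + 2 k3 + k4).
    return [yi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def final_gap_and_bound(x0, p=2, C=0.25, t0=1.0, T=20.0, dt=1e-3):
    """Integrate from rest at time t0 and return (f(x(T)) - f*, E(t0) / (C T^p))."""
    t, x, v = t0, list(x0), [0.0] * len(x0)
    # Lyapunov value at t0, using zero initial velocity and x* = 0.
    energy0 = 0.5 * sum(xi * xi for xi in x0) + C * t0 ** p * f(x0)
    for _ in range(int(round((T - t0) / dt))):
        k1x, k1v = v, accel(t, x, v, p, C)
        x2, v2 = axpy(dt / 2, k1x, x), axpy(dt / 2, k1v, v)
        k2x, k2v = v2, accel(t + dt / 2, x2, v2, p, C)
        x3, v3 = axpy(dt / 2, k2x, x), axpy(dt / 2, k2v, v)
        k3x, k3v = v3, accel(t + dt / 2, x3, v3, p, C)
        x4, v4 = axpy(dt, k3x, x), axpy(dt, k3v, v)
        k4x, k4v = v4, accel(t + dt, x4, v4, p, C)
        x = rk4_combine(x, k1x, k2x, k3x, k4x, dt)
        v = rk4_combine(v, k1v, k2v, k3v, k4v, dt)
        t += dt
    return f(x), energy0 / (C * t ** p)


if __name__ == "__main__":
    for p in (2, 3):
        gap, bound = final_gap_and_bound([1.0, 1.0], p=p)
        print(f"p={p}: f(x(T)) - f* = {gap:.3e}, bound = {bound:.3e}")
```

On a curved manifold the straight-line updates used here would have to be replaced by manifold-respecting operations, which is where the paper's Riemannian Bregman Lagrangians and Hamiltonians come in; this sketch does not attempt that.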


Source journal

SIAM journal on mathematics of data science

Self-citation rate

0.00%

Articles published