{"title":"赫尔德连续赫西亚条件下非凸优化的通用重球法","authors":"Naoki Marumo, Akiko Takeda","doi":"10.1007/s10107-024-02100-4","DOIUrl":null,"url":null,"abstract":"<p>We propose a new first-order method for minimizing nonconvex functions with Lipschitz continuous gradients and Hölder continuous Hessians. The proposed algorithm is a heavy-ball method equipped with two particular restart mechanisms. It finds a solution where the gradient norm is less than <span>\\(\\varepsilon \\)</span> in <span>\\(O(H_{\\nu }^{\\frac{1}{2 + 2 \\nu }} \\varepsilon ^{- \\frac{4 + 3 \\nu }{2 + 2 \\nu }})\\)</span> function and gradient evaluations, where <span>\\(\\nu \\in [0, 1]\\)</span> and <span>\\(H_{\\nu }\\)</span> are the Hölder exponent and constant, respectively. This complexity result covers the classical bound of <span>\\(O(\\varepsilon ^{-2})\\)</span> for <span>\\(\\nu = 0\\)</span> and the state-of-the-art bound of <span>\\(O(\\varepsilon ^{-7/4})\\)</span> for <span>\\(\\nu = 1\\)</span>. Our algorithm is <span>\\(\\nu \\)</span>-independent and thus universal; it automatically achieves the above complexity bound with the optimal <span>\\(\\nu \\in [0, 1]\\)</span> without knowledge of <span>\\(H_{\\nu }\\)</span>. In addition, the algorithm does not require other problem-dependent parameters as input, including the gradient’s Lipschitz constant or the target accuracy <span>\\(\\varepsilon \\)</span>. Numerical results illustrate that the proposed method is promising.</p>","PeriodicalId":18297,"journal":{"name":"Mathematical Programming","volume":"51 1","pages":""},"PeriodicalIF":2.2000,"publicationDate":"2024-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Universal heavy-ball method for nonconvex optimization under Hölder continuous Hessians\",\"authors\":\"Naoki Marumo, Akiko Takeda\",\"doi\":\"10.1007/s10107-024-02100-4\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>We propose a new first-order method for minimizing nonconvex functions with Lipschitz continuous gradients and Hölder continuous Hessians. The proposed algorithm is a heavy-ball method equipped with two particular restart mechanisms. It finds a solution where the gradient norm is less than <span>\\\\(\\\\varepsilon \\\\)</span> in <span>\\\\(O(H_{\\\\nu }^{\\\\frac{1}{2 + 2 \\\\nu }} \\\\varepsilon ^{- \\\\frac{4 + 3 \\\\nu }{2 + 2 \\\\nu }})\\\\)</span> function and gradient evaluations, where <span>\\\\(\\\\nu \\\\in [0, 1]\\\\)</span> and <span>\\\\(H_{\\\\nu }\\\\)</span> are the Hölder exponent and constant, respectively. This complexity result covers the classical bound of <span>\\\\(O(\\\\varepsilon ^{-2})\\\\)</span> for <span>\\\\(\\\\nu = 0\\\\)</span> and the state-of-the-art bound of <span>\\\\(O(\\\\varepsilon ^{-7/4})\\\\)</span> for <span>\\\\(\\\\nu = 1\\\\)</span>. Our algorithm is <span>\\\\(\\\\nu \\\\)</span>-independent and thus universal; it automatically achieves the above complexity bound with the optimal <span>\\\\(\\\\nu \\\\in [0, 1]\\\\)</span> without knowledge of <span>\\\\(H_{\\\\nu }\\\\)</span>. In addition, the algorithm does not require other problem-dependent parameters as input, including the gradient’s Lipschitz constant or the target accuracy <span>\\\\(\\\\varepsilon \\\\)</span>. 
Numerical results illustrate that the proposed method is promising.</p>\",\"PeriodicalId\":18297,\"journal\":{\"name\":\"Mathematical Programming\",\"volume\":\"51 1\",\"pages\":\"\"},\"PeriodicalIF\":2.2000,\"publicationDate\":\"2024-06-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Mathematical Programming\",\"FirstCategoryId\":\"100\",\"ListUrlMain\":\"https://doi.org/10.1007/s10107-024-02100-4\",\"RegionNum\":2,\"RegionCategory\":\"数学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, SOFTWARE ENGINEERING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Mathematical Programming","FirstCategoryId":"100","ListUrlMain":"https://doi.org/10.1007/s10107-024-02100-4","RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
Universal heavy-ball method for nonconvex optimization under Hölder continuous Hessians
We propose a new first-order method for minimizing nonconvex functions with Lipschitz continuous gradients and Hölder continuous Hessians. The proposed algorithm is a heavy-ball method equipped with two particular restart mechanisms. It finds a solution where the gradient norm is less than \(\varepsilon \) in \(O(H_{\nu }^{\frac{1}{2 + 2 \nu }} \varepsilon ^{- \frac{4 + 3 \nu }{2 + 2 \nu }})\) function and gradient evaluations, where \(\nu \in [0, 1]\) and \(H_{\nu }\) are the Hölder exponent and constant, respectively. This complexity result covers the classical bound of \(O(\varepsilon ^{-2})\) for \(\nu = 0\) and the state-of-the-art bound of \(O(\varepsilon ^{-7/4})\) for \(\nu = 1\). Our algorithm is \(\nu \)-independent and thus universal; it automatically achieves the above complexity bound with the optimal \(\nu \in [0, 1]\) without knowledge of \(H_{\nu }\). In addition, the algorithm does not require other problem-dependent parameters as input, including the gradient’s Lipschitz constant or the target accuracy \(\varepsilon \). Numerical results illustrate that the proposed method is promising.
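As a quick check on the complexity bound, substituting the endpoints of \(\nu \in [0, 1]\) into the exponent \(-\frac{4 + 3 \nu}{2 + 2 \nu}\) recovers the two rates quoted above: \(\nu = 0\) gives \(-\frac{4}{2} = -2\), the classical \(O(\varepsilon^{-2})\) bound, and \(\nu = 1\) gives \(-\frac{4 + 3}{2 + 2} = -\frac{7}{4}\), the state-of-the-art \(O(\varepsilon^{-7/4})\) bound.

To convey the structure of the method class, the following is a minimal heavy-ball sketch in Python. It is not the paper's algorithm: the two restart mechanisms of Marumo and Takeda are more delicate and parameter-free, whereas this sketch uses one hypothetical restart rule (drop the momentum when the objective fails to decrease) and assumed constants alpha and beta, purely for illustration.

import numpy as np

def heavy_ball_restart(f, grad, x0, alpha=1e-3, beta=0.9,
                       eps=1e-4, max_iter=100_000):
    """Try to find a point whose gradient norm is below eps."""
    x_prev = x0.copy()
    x = x0.copy()
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < eps:  # epsilon-stationarity reached
            return x
        # Heavy-ball update: gradient step plus a momentum term.
        x_next = x - alpha * g + beta * (x - x_prev)
        # Hypothetical restart rule (not the paper's): if the objective
        # did not decrease, discard the momentum and take a plain
        # gradient step instead.
        if f(x_next) > f(x):
            x_prev, x = x, x - alpha * g
        else:
            x_prev, x = x, x_next
    return x

# Example usage on the (nonconvex) Rosenbrock function:
f = lambda z: (1 - z[0])**2 + 100 * (z[1] - z[0]**2)**2
grad = lambda z: np.array([
    -2 * (1 - z[0]) - 400 * z[0] * (z[1] - z[0]**2),
    200 * (z[1] - z[0]**2),
])
print(heavy_ball_restart(f, grad, np.array([-1.2, 1.0])))

The restart-on-increase rule above is only a stand-in to show where a restart mechanism plugs into the heavy-ball loop; the paper's mechanisms are what yield the \(\nu\)-adaptive complexity guarantee.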
Journal description:
Mathematical Programming publishes original articles dealing with every aspect of mathematical optimization; that is, everything of direct or indirect use concerning the problem of optimizing a function of many variables, often subject to a set of constraints. This involves theoretical and computational issues as well as application studies. Included, along with the standard topics of linear, nonlinear, integer, conic, stochastic and combinatorial optimization, are techniques for formulating and applying mathematical programming models, convex, nonsmooth and variational analysis, the theory of polyhedra, variational inequalities, and control and game theory viewed from the perspective of mathematical programming.