{"title":"非精确Hessian信息下非凸Newton-MR的复杂度保证","authors":"Alexander Lim, Fred Roosta","doi":"10.1093/imanum/drae110","DOIUrl":null,"url":null,"abstract":"We consider an extension of the Newton-MR algorithm for nonconvex unconstrained optimization to the settings where Hessian information is approximated. Under a particular noise model on the Hessian matrix, we investigate the iteration and operation complexities of this variant to achieve appropriate sub-optimality criteria in several nonconvex settings. We do this by first considering functions that satisfy the (generalized) Polyak–Łojasiewicz condition, a special sub-class of nonconvex functions. We show that, under certain conditions, our algorithm achieves global linear convergence rate. We then consider more general nonconvex settings where the rate to obtain first-order sub-optimality is shown to be sub-linear. In all these settings we show that our algorithm converges regardless of the degree of approximation of the Hessian as well as the accuracy of the solution to the sub-problem. Finally, we compare the performance of our algorithm with several alternatives on a few machine learning problems.","PeriodicalId":56295,"journal":{"name":"IMA Journal of Numerical Analysis","volume":"101 1","pages":""},"PeriodicalIF":2.4000,"publicationDate":"2025-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Complexity guarantees for nonconvex Newton-MR under inexact Hessian information\",\"authors\":\"Alexander Lim, Fred Roosta\",\"doi\":\"10.1093/imanum/drae110\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We consider an extension of the Newton-MR algorithm for nonconvex unconstrained optimization to the settings where Hessian information is approximated. Under a particular noise model on the Hessian matrix, we investigate the iteration and operation complexities of this variant to achieve appropriate sub-optimality criteria in several nonconvex settings. We do this by first considering functions that satisfy the (generalized) Polyak–Łojasiewicz condition, a special sub-class of nonconvex functions. We show that, under certain conditions, our algorithm achieves global linear convergence rate. We then consider more general nonconvex settings where the rate to obtain first-order sub-optimality is shown to be sub-linear. In all these settings we show that our algorithm converges regardless of the degree of approximation of the Hessian as well as the accuracy of the solution to the sub-problem. 
Finally, we compare the performance of our algorithm with several alternatives on a few machine learning problems.\",\"PeriodicalId\":56295,\"journal\":{\"name\":\"IMA Journal of Numerical Analysis\",\"volume\":\"101 1\",\"pages\":\"\"},\"PeriodicalIF\":2.4000,\"publicationDate\":\"2025-03-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IMA Journal of Numerical Analysis\",\"FirstCategoryId\":\"100\",\"ListUrlMain\":\"https://doi.org/10.1093/imanum/drae110\",\"RegionNum\":2,\"RegionCategory\":\"数学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"MATHEMATICS, APPLIED\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IMA Journal of Numerical Analysis","FirstCategoryId":"100","ListUrlMain":"https://doi.org/10.1093/imanum/drae110","RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"MATHEMATICS, APPLIED","Score":null,"Total":0}
Complexity guarantees for nonconvex Newton-MR under inexact Hessian information
We consider an extension of the Newton-MR algorithm for nonconvex unconstrained optimization to settings where Hessian information is only approximated. Under a particular noise model on the Hessian matrix, we investigate the iteration and operation complexities required for this variant to achieve appropriate sub-optimality criteria in several nonconvex settings. We first consider functions that satisfy the (generalized) Polyak–Łojasiewicz condition, a special sub-class of nonconvex functions, and show that, under certain conditions, our algorithm achieves a global linear convergence rate. We then consider more general nonconvex settings, where the rate of convergence to first-order sub-optimality is shown to be sub-linear. In all of these settings, we show that our algorithm converges regardless of the degree of Hessian approximation and the accuracy of the sub-problem solution. Finally, we compare the performance of our algorithm with several alternatives on a few machine learning problems.
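For reference, the classical Polyak–Łojasiewicz (PL) condition invoked above states that, for some µ > 0,

```latex
% Classical PL inequality; the paper works with a generalized variant.
\frac{1}{2}\,\lVert \nabla f(x) \rVert^{2} \;\ge\; \mu \left( f(x) - f^{*} \right)
\quad \text{for all } x,
```

where f* denotes the minimum value of f. Functions satisfying this inequality need not be convex, yet every stationary point is a global minimizer, which is what makes a global linear rate possible on this sub-class.

To make the algorithmic template concrete, the following is a minimal, illustrative sketch of a Newton-MR-style iteration with an inexact Hessian, not the authors' exact method: the sub-problem of (approximately) minimizing ||Hp + g|| is handled by MINRES, which requires only symmetry of H, so noisy or indefinite Hessian approximations remain admissible. The backtracking Armijo line search, the steepest-descent fallback, and the toy noise model in the demo are simplifying assumptions.

```python
import numpy as np
from scipy.sparse.linalg import minres


def newton_mr_inexact(f, grad, hess_approx, x0, tol=1e-6,
                      max_iter=100, inner_iters=50, rho=1e-4):
    """Illustrative Newton-MR-style loop with an approximate Hessian.

    hess_approx(x) may return any *symmetric* (possibly noisy or
    indefinite) approximation of the true Hessian: MINRES, unlike CG,
    remains well defined in that case.
    """
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= tol:
            break
        H = hess_approx(x)                 # inexact Hessian: H = true Hessian + noise
        # Inexact sub-problem solve: MINRES approximately minimizes
        # ||H p + g|| over a Krylov subspace; capping the inner iteration
        # count keeps the sub-problem solution deliberately inexact.
        p, _ = minres(H, -g, maxiter=inner_iters)
        if g @ p >= 0:                     # not a descent direction: fall back
            p = -g
        # Backtracking (Armijo) line search -- a simplification of the
        # line search analysed in the paper.
        alpha, fx = 1.0, f(x)
        while f(x + alpha * p) > fx + rho * alpha * (g @ p) and alpha > 1e-12:
            alpha *= 0.5
        x = x + alpha * p
    return x


if __name__ == "__main__":
    # Toy demo: f(x) = x0^2 + sin^2(x1) satisfies a PL-type inequality
    # near its minimizers; a small symmetric perturbation of the true
    # Hessian stands in for the paper's Hessian noise model.
    rng = np.random.default_rng(0)

    f = lambda x: x[0] ** 2 + np.sin(x[1]) ** 2
    grad = lambda x: np.array([2.0 * x[0], np.sin(2.0 * x[1])])

    def hess_approx(x):
        H_true = np.array([[2.0, 0.0],
                           [0.0, 2.0 * np.cos(2.0 * x[1])]])
        E = rng.normal(scale=1e-2, size=(2, 2))
        return H_true + (E + E.T) / 2.0    # keep the noise symmetric

    x_opt = newton_mr_inexact(f, grad, hess_approx, [1.5, 1.0])
    print("minimizer:", x_opt, " f:", f(x_opt))
```

Capping the inner MINRES iterations (rather than solving the sub-problem to high accuracy) mirrors the abstract's claim that convergence holds regardless of the accuracy of the sub-problem solution.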
Journal overview:
The IMA Journal of Numerical Analysis (IMAJNA) publishes original contributions in all fields of numerical analysis. Articles are accepted that treat the theory, development, or use of practical algorithms, as well as the interactions between these aspects. Occasional survey articles are also published.