Title: Complexity guarantees for nonconvex Newton-MR under inexact Hessian information
Authors: Alexander Lim, Fred Roosta
DOI: 10.1093/imanum/drae110 (https://doi.org/10.1093/imanum/drae110)
Journal: IMA Journal of Numerical Analysis, Q1 (Mathematics, Applied), Impact Factor 2.3
Publication date: 2025-03-05
Publication type: Journal Article
Citations: 0
Abstract
We consider an extension of the Newton-MR algorithm for nonconvex unconstrained optimization to settings in which Hessian information is only approximated. Under a particular noise model on the Hessian matrix, we investigate the iteration and operation complexities of this variant for achieving appropriate sub-optimality criteria in several nonconvex settings. We first consider functions that satisfy the (generalized) Polyak–Łojasiewicz condition, a special sub-class of nonconvex functions, and show that, under certain conditions, our algorithm achieves a global linear convergence rate. We then consider more general nonconvex settings, where the rate for obtaining first-order sub-optimality is shown to be sub-linear. In all of these settings, we show that our algorithm converges regardless of the degree of Hessian approximation and of the accuracy of the solution to the sub-problem. Finally, we compare the performance of our algorithm with several alternatives on a few machine learning problems.
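To make the idea concrete, the following is a minimal sketch (not the paper's implementation) of a Newton-MR-style iteration with an inexact Hessian: at each step the Hessian is perturbed by a bounded symmetric noise term, the update direction is taken as the minimum-residual (pseudoinverse) solution of the Newton system, and a simple backtracking line search on the gradient norm stands in for the paper's exact step-size rule. The function names, noise model, and line-search constants here are illustrative assumptions.

```python
import numpy as np

def newton_mr_inexact(grad, hess, x0, hess_noise=0.0, tol=1e-8,
                      max_iter=100, rng=None):
    """Sketch of Newton-MR with an additively perturbed Hessian.

    grad, hess : callables returning the gradient vector and Hessian matrix.
    hess_noise : spectral-norm bound on the symmetric Hessian perturbation
                 (a stand-in for the paper's noise model).
    """
    rng = np.random.default_rng(0) if rng is None else rng
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= tol:
            break
        H = hess(x)
        if hess_noise > 0.0:
            E = rng.standard_normal(H.shape)
            E = 0.5 * (E + E.T)                      # symmetrize the noise
            H = H + hess_noise * E / np.linalg.norm(E, 2)
        # Minimum-residual direction: least-squares solution of H p = -g,
        # computed here via the pseudoinverse (MINRES would be used at scale).
        p = -np.linalg.pinv(H) @ g
        # Backtracking line search requiring sufficient gradient-norm decrease.
        t, rho, c = 1.0, 0.5, 1e-4
        while (np.linalg.norm(grad(x + t * p))
               > (1.0 - c * t) * np.linalg.norm(g) and t > 1e-12):
            t *= rho
        x = x + t * p
    return x
```

For large-scale problems the pseudoinverse step would be replaced by an inexact MINRES solve of the sub-problem, which is where the algorithm's tolerance to both Hessian noise and inexact sub-problem solutions, as analyzed in the paper, comes into play.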
Journal description:
The IMA Journal of Numerical Analysis (IMAJNA) publishes original contributions to all fields of numerical analysis. Articles are accepted that treat the theory, development, or use of practical algorithms, as well as interactions between these aspects. Occasional survey articles are also published.