{"title":"基于不定矩阵上CG的非孤立极小值信赖域的快速收敛。","authors":"Quentin Rebjock, Nicolas Boumal","doi":"10.1007/s10107-024-02140-w","DOIUrl":null,"url":null,"abstract":"<p><p>Trust-region methods (TR) can converge quadratically to minima where the Hessian is positive definite. However, if the minima are not isolated, then the Hessian there cannot be positive definite. The weaker Polyak-Łojasiewicz (PŁ) condition is compatible with non-isolated minima, and it is enough for many algorithms to preserve good local behavior. Yet, TR with an <i>exact</i> subproblem solver lacks even basic features such as a capture theorem under PŁ. In practice, a popular <i>inexact</i> subproblem solver is the truncated conjugate gradient method (tCG). Empirically, TR-tCG exhibits superlinear convergence under PŁ. We confirm this theoretically. The main mathematical obstacle is that, under PŁ, at points arbitrarily close to minima, the Hessian has vanishingly small, possibly negative eigenvalues. Thus, tCG is applied to ill-conditioned, indefinite systems. Yet, the core theory underlying tCG is that of CG, which assumes a positive definite operator. Accordingly, we develop new tools to analyze the dynamics of CG in the presence of small eigenvalues of any sign, for the regime of interest to TR-tCG.</p>","PeriodicalId":18297,"journal":{"name":"Mathematical Programming","volume":"213 1-2","pages":"343-384"},"PeriodicalIF":2.5000,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12401806/pdf/","citationCount":"0","resultStr":"{\"title\":\"Fast convergence of trust-regions for non-isolated minima via analysis of CG on indefinite matrices.\",\"authors\":\"Quentin Rebjock, Nicolas Boumal\",\"doi\":\"10.1007/s10107-024-02140-w\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Trust-region methods (TR) can converge quadratically to minima where the Hessian is positive definite. However, if the minima are not isolated, then the Hessian there cannot be positive definite. The weaker Polyak-Łojasiewicz (PŁ) condition is compatible with non-isolated minima, and it is enough for many algorithms to preserve good local behavior. Yet, TR with an <i>exact</i> subproblem solver lacks even basic features such as a capture theorem under PŁ. In practice, a popular <i>inexact</i> subproblem solver is the truncated conjugate gradient method (tCG). Empirically, TR-tCG exhibits superlinear convergence under PŁ. We confirm this theoretically. The main mathematical obstacle is that, under PŁ, at points arbitrarily close to minima, the Hessian has vanishingly small, possibly negative eigenvalues. Thus, tCG is applied to ill-conditioned, indefinite systems. Yet, the core theory underlying tCG is that of CG, which assumes a positive definite operator. Accordingly, we develop new tools to analyze the dynamics of CG in the presence of small eigenvalues of any sign, for the regime of interest to TR-tCG.</p>\",\"PeriodicalId\":18297,\"journal\":{\"name\":\"Mathematical Programming\",\"volume\":\"213 1-2\",\"pages\":\"343-384\"},\"PeriodicalIF\":2.5000,\"publicationDate\":\"2025-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12401806/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Mathematical Programming\",\"FirstCategoryId\":\"100\",\"ListUrlMain\":\"https://doi.org/10.1007/s10107-024-02140-w\",\"RegionNum\":2,\"RegionCategory\":\"数学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2024/10/18 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, SOFTWARE ENGINEERING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Mathematical Programming","FirstCategoryId":"100","ListUrlMain":"https://doi.org/10.1007/s10107-024-02140-w","RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/10/18 0:00:00","PubModel":"Epub","JCR":"Q2","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
Fast convergence of trust-regions for non-isolated minima via analysis of CG on indefinite matrices.
Trust-region methods (TR) can converge quadratically to minima where the Hessian is positive definite. However, if the minima are not isolated, then the Hessian there cannot be positive definite. The weaker Polyak-Łojasiewicz (PŁ) condition is compatible with non-isolated minima, and it is enough for many algorithms to preserve good local behavior. Yet, TR with an exact subproblem solver lacks even basic features such as a capture theorem under PŁ. In practice, a popular inexact subproblem solver is the truncated conjugate gradient method (tCG). Empirically, TR-tCG exhibits superlinear convergence under PŁ. We confirm this theoretically. The main mathematical obstacle is that, under PŁ, at points arbitrarily close to minima, the Hessian has vanishingly small, possibly negative eigenvalues. Thus, tCG is applied to ill-conditioned, indefinite systems. Yet, the core theory underlying tCG is that of CG, which assumes a positive definite operator. Accordingly, we develop new tools to analyze the dynamics of CG in the presence of small eigenvalues of any sign, for the regime of interest to TR-tCG.
期刊介绍:
Mathematical Programming publishes original articles dealing with every aspect of mathematical optimization; that is, everything of direct or indirect use concerning the problem of optimizing a function of many variables, often subject to a set of constraints. This involves theoretical and computational issues as well as application studies. Included, along with the standard topics of linear, nonlinear, integer, conic, stochastic and combinatorial optimization, are techniques for formulating and applying mathematical programming models, convex, nonsmooth and variational analysis, the theory of polyhedra, variational inequalities, and control and game theory viewed from the perspective of mathematical programming.