Fast convergence of trust-regions for non-isolated minima via analysis of CG on indefinite matrices.

IF 2.5 | Zone 2 (Mathematics) | JCR Q2: COMPUTER SCIENCE, SOFTWARE ENGINEERING

Mathematical Programming | Pub Date: 2025-01-01 | Epub Date: 2024-10-18 | DOI: 10.1007/s10107-024-02140-w

Quentin Rebjock, Nicolas Boumal

Mathematical Programming, vol. 213 (1-2), pp. 343-384. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12401806/pdf/

Citations: 0

Abstract


Trust-region methods (TR) can converge quadratically to minima where the Hessian is positive definite. However, if the minima are not isolated, then the Hessian there cannot be positive definite. The weaker Polyak-Łojasiewicz (PŁ) condition is compatible with non-isolated minima, and it is enough for many algorithms to preserve good local behavior. Yet, TR with an exact subproblem solver lacks even basic features such as a capture theorem under PŁ. In practice, a popular inexact subproblem solver is the truncated conjugate gradient method (tCG). Empirically, TR-tCG exhibits superlinear convergence under PŁ. We confirm this theoretically. The main mathematical obstacle is that, under PŁ, at points arbitrarily close to minima, the Hessian has vanishingly small, possibly negative eigenvalues. Thus, tCG is applied to ill-conditioned, indefinite systems. Yet, the core theory underlying tCG is that of CG, which assumes a positive definite operator. Accordingly, we develop new tools to analyze the dynamics of CG in the presence of small eigenvalues of any sign, for the regime of interest to TR-tCG.
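To make the role of tCG concrete, here is a minimal NumPy sketch of a truncated conjugate gradient subproblem solver in the classical Steihaug-Toint style. It is an illustration under stated assumptions, not the authors' implementation: the function name `truncated_cg`, the helper `_to_boundary`, the parameters `kappa` and `theta`, and the residual stopping rule `||r_k|| <= ||r_0|| * min(kappa, ||r_0||**theta)` are taken from common textbook presentations and may differ from the exact variant analyzed in the paper. The sketch shows the three ways tCG can stop, including the negative-curvature exit that arises when the Hessian is indefinite.

```python
import numpy as np


def _to_boundary(s, p, delta):
    """Return tau >= 0 such that ||s + tau * p|| == delta (assumes ||s|| <= delta)."""
    a = p @ p
    b = 2.0 * (s @ p)
    c = s @ s - delta ** 2  # non-positive because s lies inside the region
    return (-b + np.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)


def truncated_cg(H, g, delta, kappa=0.1, theta=1.0, max_iter=None):
    """Approximately minimize g@s + 0.5*s@H@s subject to ||s|| <= delta.

    H may be indefinite. kappa and theta are illustrative (assumed) tolerances.
    Returns the step and a string naming the stopping reason.
    """
    n = g.shape[0]
    max_iter = n if max_iter is None else max_iter
    s = np.zeros(n)
    r = g.astype(float)       # residual: gradient of the model at s
    p = -r                    # first search direction is steepest descent
    r0_norm = np.linalg.norm(r)
    if r0_norm == 0.0:
        return s, "zero_gradient"
    tol = r0_norm * min(kappa, r0_norm ** theta)

    for _ in range(max_iter):
        Hp = H @ p
        curvature = p @ Hp
        if curvature <= 0.0:
            # Non-positive curvature: the model is unbounded along p,
            # so follow p to the trust-region boundary.
            return s + _to_boundary(s, p, delta) * p, "negative_curvature"
        alpha = (r @ r) / curvature
        s_next = s + alpha * p
        if np.linalg.norm(s_next) >= delta:
            # The CG iterate leaves the region: truncate at the boundary.
            return s + _to_boundary(s, p, delta) * p, "boundary"
        r_next = r + alpha * Hp
        if np.linalg.norm(r_next) <= tol:
            return s_next, "residual"
        beta = (r_next @ r_next) / (r @ r)
        p = -r_next + beta * p
        s, r = s_next, r_next
    return s, "max_iter"
```

For example, with `H = np.diag([1.0, -1.0])`, `g = np.array([0.0, 1.0])` and `delta = 2.0`, the first search direction already has negative curvature, so the sketch exits immediately with a step on the trust-region boundary. The regime the abstract describes is harder: near non-isolated minima the small eigenvalues of the Hessian can have either sign, so the positive-definite CG theory does not apply directly.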


Source journal

Mathematical Programming | Mathematics, Computer Science: Software Engineering

CiteScore: 5.70

Self-citation rate: 11.10%

Articles published: 160

Review time: 4-8 weeks

About the journal: Mathematical Programming publishes original articles dealing with every aspect of mathematical optimization; that is, everything of direct or indirect use concerning the problem of optimizing a function of many variables, often subject to a set of constraints. This involves theoretical and computational issues as well as application studies. Included, along with the standard topics of linear, nonlinear, integer, conic, stochastic and combinatorial optimization, are techniques for formulating and applying mathematical programming models, convex, nonsmooth and variational analysis, the theory of polyhedra, variational inequalities, and control and game theory viewed from the perspective of mathematical programming.