A Levenberg–Marquardt Method for Nonsmooth Regularized Least Squares
Aleksandr Y. Aravkin, Robert Baraldi, Dominique Orban
DOI: 10.1137/22m1538971
Abstract
SIAM Journal on Scientific Computing, Volume 46, Issue 4, Pages A2557–A2581, August 2024.

We develop a Levenberg–Marquardt method for minimizing the sum of a smooth nonlinear least-squares term $f(x) = \tfrac{1}{2}\|F(x)\|_2^2$ and a nonsmooth term $h$. Both $f$ and $h$ may be nonconvex. Steps are computed by minimizing the sum of a regularized linear least-squares model and a model of $h$ using a first-order method such as the proximal gradient method. We establish global convergence to a first-order stationary point under the assumptions that $F$ and its Jacobian are Lipschitz continuous and $h$ is proper and lower semicontinuous. In the worst case, our method performs $O(\epsilon^{-2})$ iterations to bring a measure of stationarity below $\epsilon$. We also derive a trust-region variant that enjoys similar asymptotic worst-case iteration complexity as a special case of the trust-region algorithm of Aravkin, Baraldi, and Orban [SIAM J. Optim., 32 (2022), pp. 900–929]. We report numerical results on three examples: a group-lasso basis-pursuit denoise example, a nonlinear support vector machine, and parameter estimation in a neuroscience application. To implement those examples, we describe in detail how to evaluate proximal operators for separable $h$ and for the group lasso with a trust-region constraint. In all cases, the Levenberg–Marquardt methods perform fewer outer iterations than either a proximal gradient method with adaptive step length or a quasi-Newton trust-region method, neither of which exploits the least-squares structure of the problem. Our results also highlight the need for more sophisticated subproblem solvers than simple first-order methods.
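To make the step computation concrete, below is a minimal Python sketch of the scheme the abstract describes, specialized to the separable choice $h(x) = \lambda\|x\|_1$, whose proximal operator is the closed-form soft-thresholding map. This is an illustrative assumption, not the authors' implementation: the function names (`lm_prox`, `soft_threshold`), the acceptance threshold, the update factors for the regularization parameter, and the stopping test are all placeholders chosen for readability.

```python
import numpy as np

def soft_threshold(u, tau):
    # Proximal operator of tau * ||.||_1 (separable, closed form).
    return np.sign(u) * np.maximum(np.abs(u) - tau, 0.0)

def lm_prox(F, J, x0, lam, sigma=1.0, tol=1e-6, max_outer=100, inner_iters=50):
    """Hypothetical sketch of a Levenberg-Marquardt method for
    min_x 0.5*||F(x)||^2 + lam*||x||_1.

    F : callable returning the residual vector F(x)
    J : callable returning the Jacobian of F at x
    """
    x = x0.copy()
    for _ in range(max_outer):
        Fx, Jx = F(x), J(x)
        # Step length 1/L, where L bounds the Lipschitz constant of the
        # gradient of the regularized linear least-squares model.
        L = np.linalg.norm(Jx, 2) ** 2 + sigma
        t = 1.0 / L
        # Inner proximal-gradient loop on the convex model
        #   m(s) = 0.5*||Fx + Jx s||^2 + 0.5*sigma*||s||^2 + lam*||x + s||_1.
        # The prox of s -> lam*||x + s||_1 is soft-thresholding shifted by x.
        s = np.zeros_like(x)
        for _ in range(inner_iters):
            grad = Jx.T @ (Fx + Jx @ s) + sigma * s
            s = soft_threshold(x + (s - t * grad), t * lam) - x
        # Ratio of actual to predicted decrease decides step acceptance.
        obj = 0.5 * Fx @ Fx + lam * np.abs(x).sum()
        trial = x + s
        obj_trial = 0.5 * np.sum(F(trial) ** 2) + lam * np.abs(trial).sum()
        model_trial = (0.5 * np.sum((Fx + Jx @ s) ** 2)
                       + 0.5 * sigma * s @ s + lam * np.abs(trial).sum())
        rho = (obj - obj_trial) / max(obj - model_trial, 1e-16)
        if rho >= 1e-4:       # accept; relax the regularization parameter
            x = trial
            sigma = max(sigma / 3.0, 1e-8)
        else:                 # reject; regularize more strongly
            sigma *= 3.0
        # Crude surrogate for the paper's stationarity measure.
        if np.linalg.norm(s) / t < tol * max(1.0, np.linalg.norm(x)):
            break
    return x
```

Note that the fixed inner-iteration budget above is exactly the kind of simple first-order subproblem solver whose limitations the abstract points out; in practice one would warm-start the inner loop and tighten its tolerance adaptively.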