{"title":"Convergence for nonconvex ADMM, with applications to CT imaging.","authors":"Rina Foygel Barber, Emil Y Sidky","doi":"","DOIUrl":null,"url":null,"abstract":"<p><p>The alternating direction method of multipliers (ADMM) algorithm is a powerful and flexible tool for complex optimization problems of the form <math><mi>m</mi> <mi>i</mi> <mi>n</mi> <mo>{</mo> <mi>f</mi> <mo>(</mo> <mi>x</mi> <mo>)</mo> <mo>+</mo> <mi>g</mi> <mo>(</mo> <mi>y</mi> <mo>)</mo> <mspace></mspace> <mo>:</mo> <mspace></mspace> <mi>A</mi> <mi>x</mi> <mo>+</mo> <mi>B</mi> <mi>y</mi> <mo>=</mo> <mi>c</mi> <mo>}</mo></math> . ADMM exhibits robust empirical performance across a range of challenging settings including nonsmoothness and nonconvexity of the objective functions <math><mi>f</mi></math> and <math><mi>g</mi></math> , and provides a simple and natural approach to the inverse problem of image reconstruction for computed tomography (CT) imaging. From the theoretical point of view, existing results for convergence in the nonconvex setting generally assume smoothness in at least one of the component functions in the objective. In this work, our new theoretical results provide convergence guarantees under a restricted strong convexity assumption without requiring smoothness or differentiability, while still allowing differentiable terms to be treated approximately if needed. We validate these theoretical results empirically, with a simulated example where both <math><mi>f</mi></math> and <math><mi>g</mi></math> are nondifferentiable-and thus outside the scope of existing theory-as well as a simulated CT image reconstruction problem.</p>","PeriodicalId":50161,"journal":{"name":"Journal of Machine Learning Research","volume":null,"pages":null},"PeriodicalIF":4.3000,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11155492/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Machine Learning Research","FirstCategoryId":"94","ListUrlMain":"","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
Abstract
The alternating direction method of multipliers (ADMM) algorithm is a powerful and flexible tool for complex optimization problems of the form min{f(x) + g(y) : Ax + By = c}. ADMM exhibits robust empirical performance across a range of challenging settings, including nonsmoothness and nonconvexity of the objective functions f and g, and provides a simple and natural approach to the inverse problem of image reconstruction for computed tomography (CT) imaging. From the theoretical point of view, existing results for convergence in the nonconvex setting generally assume smoothness in at least one of the component functions in the objective. In this work, our new theoretical results provide convergence guarantees under a restricted strong convexity assumption without requiring smoothness or differentiability, while still allowing differentiable terms to be treated approximately if needed. We validate these theoretical results empirically, with a simulated example where both f and g are nondifferentiable (and thus outside the scope of existing theory), as well as a simulated CT image reconstruction problem.
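For readers unfamiliar with the method, the sketch below shows the generic scaled-form ADMM iteration on a toy instance of min{f(x) + g(y) : Ax + By = c} with A = I, B = -I, c = 0, taking f(x) = λ‖x‖₁ (nonsmooth) and g(y) = ½‖y − b‖². This is a minimal illustration of the standard algorithm under those assumptions, not the paper's analysis or its CT reconstruction setup; the function names and parameters are illustrative.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (closed form; handles the nonsmooth f)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def admm_toy(b, lam=0.1, rho=1.0, n_iter=200):
    """Scaled-form ADMM for min lam*||x||_1 + 0.5*||y - b||^2  s.t.  x - y = 0.

    Updates (u is the scaled dual variable):
        x <- prox_{(lam/rho)||.||_1}(y - u)            # soft-thresholding
        y <- argmin_y 0.5||y - b||^2 + (rho/2)||x - y + u||^2
        u <- u + x - y                                 # dual ascent step
    """
    x = np.zeros_like(b)
    y = np.zeros_like(b)
    u = np.zeros_like(b)
    for _ in range(n_iter):
        x = soft_threshold(y - u, lam / rho)
        y = (b + rho * (x + u)) / (1.0 + rho)  # closed-form quadratic minimizer
        u = u + x - y                          # drives x and y together
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    b = rng.normal(size=20)
    x_hat = admm_toy(b, lam=0.5)
    print("sparse estimate:", np.round(x_hat, 3))
```

Each subproblem here has a closed-form solution, which is what makes the splitting attractive: the nonsmooth ℓ₁ term is handled entirely through its proximal operator, without ever differentiating it.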
Journal Introduction:
The Journal of Machine Learning Research (JMLR) provides an international forum for the electronic and paper publication of high-quality scholarly articles in all areas of machine learning. All published papers are freely available online.
JMLR has a commitment to rigorous yet rapid reviewing.
JMLR seeks previously unpublished papers on machine learning that contain:
new principled algorithms with sound empirical validation, and with justification of theoretical, psychological, or biological nature;
experimental and/or theoretical studies yielding new insight into the design and behavior of learning in intelligent systems;
accounts of applications of existing techniques that shed light on the strengths and weaknesses of the methods;
formalization of new learning tasks (e.g., in the context of new applications) and of methods for assessing performance on those tasks;
development of new analytical frameworks that advance theoretical studies of practical learning methods;
computational models of data from natural learning systems at the behavioral or neural level; or
extremely well-written surveys of existing work.