Stochastic Gauss–Seidel type inertial proximal alternating linearized minimization and its application to proximal neural networks

Impact factor 1.2; Zone 4 (Mathematics); JCR Q3, MATHEMATICS, APPLIED

Mathematical Methods of Operations Research; Pub Date: 2024-02-06; DOI: 10.1007/s00186-024-00851-6



Abstract

In many optimization problems arising in machine learning, image processing, and statistics, the objective function has a special form involving huge amounts of data, which motivates the use of stochastic algorithms. In this paper, we study a broad class of such nonconvex nonsmooth minimization problems, whose objective function is the sum of a smooth function of all the variables and two nonsmooth functions, each depending on a single variable. We propose to solve this problem with a stochastic Gauss–Seidel type inertial proximal alternating linearized minimization algorithm (denoted by SGiPALM). We prove that, under the Kurdyka–Łojasiewicz (KŁ) property and some mild conditions, each bounded sequence generated by SGiPALM with a variance-reduced stochastic gradient estimator either converges globally to a critical point after a finite number of iterations or almost surely satisfies the finite length property. We also apply the SGiPALM algorithm to proximal neural networks (PNN) with 4 layers for classification tasks on the MNIST dataset and compare it with other deterministic and stochastic optimization algorithms; the results illustrate the effectiveness of the proposed algorithm.
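To make the problem structure described in the abstract concrete, the following is a minimal sketch of a stochastic, inertial, alternating proximal-gradient loop on a toy two-block problem. It is not the authors' SGiPALM: the abstract does not give the inertial scheme, the step-size rules, or the variance-reduced estimator, so this sketch assumes a generic extrapolation step with a fixed parameter `beta`, a fixed step size, and a plain mini-batch gradient swapped in for the variance-reduced estimator. The toy objective (a bilinear least-squares term plus l1 penalties on each block) and all function and parameter names are hypothetical and chosen only for illustration.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (assumed nonsmooth term for both blocks)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def minibatch_grads(x, y, A, C, b, idx):
    """Mini-batch gradients of H(x, y) = mean_i 0.5 * ((a_i.x)(c_i.y) - b_i)^2."""
    ax = A[idx] @ x
    cy = C[idx] @ y
    r = ax * cy - b[idx]
    gx = A[idx].T @ (r * cy) / len(idx)
    gy = C[idx].T @ (r * ax) / len(idx)
    return gx, gy


def objective(x, y, A, C, b, lam):
    """Full objective: smooth coupling term plus one l1 penalty per block."""
    r = (A @ x) * (C @ y) - b
    return 0.5 * np.mean(r ** 2) + lam * (np.abs(x).sum() + np.abs(y).sum())


def stochastic_inertial_palm(A, C, b, lam=1e-3, step=0.05, beta=0.3,
                             batch=32, iters=2000, seed=0):
    """Illustrative loop only; step, beta, and batch are assumed constants."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    x = 0.1 * rng.standard_normal(A.shape[1])
    y = 0.1 * rng.standard_normal(C.shape[1])
    x_prev, y_prev = x.copy(), y.copy()
    for _ in range(iters):
        idx = rng.choice(n, size=batch, replace=False)
        # x-block: extrapolate, take a stochastic gradient step, apply the prox.
        x_ex = x + beta * (x - x_prev)
        gx, _ = minibatch_grads(x_ex, y, A, C, b, idx)
        x_prev, x = x, soft_threshold(x_ex - step * gx, step * lam)
        # y-block: same pattern, but using the freshly updated x (alternating order).
        y_ex = y + beta * (y - y_prev)
        _, gy = minibatch_grads(x, y_ex, A, C, b, idx)
        y_prev, y = y, soft_threshold(y_ex - step * gy, step * lam)
    return x, y


def make_toy_data(n=200, seed=1):
    """Synthetic, noise-free data for the hypothetical bilinear model."""
    rng = np.random.default_rng(seed)
    x_true = np.array([1.0, -1.0, 0.0, 0.5, 0.0])
    y_true = np.array([0.5, 0.0, 1.0, 0.0, -1.0])
    A = rng.standard_normal((n, x_true.size))
    C = rng.standard_normal((n, y_true.size))
    b = (A @ x_true) * (C @ y_true)
    return A, C, b
```

A possible usage is `A, C, b = make_toy_data()` followed by `x, y = stochastic_inertial_palm(A, C, b)`, after which `objective(x, y, A, C, b, 1e-3)` can be compared with the objective at the zero vectors. Replacing the mini-batch gradient with a variance-reduced estimator, as the paper does, would change only `minibatch_grads` and the bookkeeping around it.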


Source journal

Mathematical Methods of Operations Research (Mathematics, Applied)

CiteScore: 1.90

Self-citation rate: 0.00%

Review time: >12 weeks

Journal description: This peer-reviewed journal publishes original and high-quality articles on important mathematical and computational aspects of operations research, in particular in the areas of continuous and discrete mathematical optimization, stochastics, and game theory. Theoretically oriented papers are supposed to include explicit motivations of assumptions and results, while application-oriented papers need to contain substantial mathematical contributions. Suggestions for algorithms should be accompanied by numerical evidence for their superiority over state-of-the-art methods. Articles must be of interest to a large audience in operations research, written in clear and correct English, and typeset in LaTeX. A special section contains invited tutorial papers on advanced mathematical or computational aspects of operations research, aiming at making such methodologies accessible to a wider audience. All papers are refereed. The emphasis is on originality, quality, and importance.