Zero Order Algorithm for Decentralized Optimization Problems

IF 0.5 · JCR Q3 (Mathematics) · CAS Zone 4 (Mathematics)
A. S. Veprikov, E. D. Petrov, G. V. Evseev, A. N. Beznosikov
Doklady Mathematics, Vol. 110, Suppl. 1, pp. S261–S277. Published 2025-03-22. DOI: 10.1134/S1064562424602336
PDF: https://link.springer.com/content/pdf/10.1134/S1064562424602336.pdf
Citations: 0

Abstract

In this paper we consider a distributed optimization problem in the black-box formulation: the target function f is decomposed into the sum of \(n\) functions \({{f}_{i}}\), where \(n\) is the number of workers, and each worker is assumed to have access only to a noisy zeroth-order oracle, i.e., to the values \({{f}_{i}}(x)\) corrupted by noise. We propose a new method, ZO-MARINA, based on the state-of-the-art distributed optimization algorithm MARINA. In particular, two modifications adapt the algorithm to the black-box setting: (i) approximations of the gradient are used instead of its true value, and (ii) the difference of two approximated gradients at a subset of coordinates is used instead of the compression operator. A theoretical convergence analysis is provided for non-convex functions and for functions satisfying the PL condition; the convergence rate of the proposed algorithm is consistent with the results for the algorithm that uses a first-order oracle. The theoretical results are validated in computational experiments on finding optimal hyperparameters for a ResNet-18 neural network trained on the CIFAR-10 dataset, and for an SVM model trained on LibSVM datasets and on the MNIST-784 dataset.
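The two modifications described above can be illustrated with a short sketch. The code below is a minimal illustration under stated assumptions, not the paper's exact estimator: it uses a standard two-point randomized finite-difference estimate of the gradient, and a MARINA-style message that keeps the difference of two such estimates only on a chosen coordinate subset. The function names, the smoothing parameter `mu`, and the coordinate-selection rule are assumptions for the example.

```python
import numpy as np

def zo_gradient(f, x, e, mu=1e-4):
    """Two-point zeroth-order gradient estimate of f at x along direction e:
    g(x) ~= (f(x + mu*e) - f(x - mu*e)) / (2*mu) * e.
    The caller supplies the random direction e (e.g. standard Gaussian)."""
    return (f(x + mu * e) - f(x - mu * e)) / (2 * mu) * e

def sparse_zo_diff(f, x_new, x_old, coords, e, mu=1e-4):
    """MARINA-style communication saving (illustrative): instead of applying a
    compression operator to the gradient, the worker forms the difference of
    two zeroth-order estimates and transmits it only on the coordinate subset
    `coords`; all other entries of the message are zero."""
    d = np.zeros_like(x_new)
    diff = zo_gradient(f, x_new, e, mu) - zo_gradient(f, x_old, e, mu)
    d[coords] = diff[coords]
    return d
```

Averaged over many random directions, the two-point estimator recovers the true gradient in expectation (up to O(mu) smoothing bias), which is the property the convergence analysis of such methods typically relies on.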

Source journal: Doklady Mathematics (Mathematics)
CiteScore: 1.00 · Self-citation rate: 16.70% · Articles per year: 39 · Review time: 3-6 weeks
Journal description: Doklady Mathematics is a journal of the Presidium of the Russian Academy of Sciences. It contains English translations of papers published in Doklady Akademii Nauk (Proceedings of the Russian Academy of Sciences), which was founded in 1933 and is published 36 times a year. Doklady Mathematics covers the following areas: mathematics, mathematical physics, computer science, control theory, and computers. It publishes brief scientific reports on previously unpublished significant new research in mathematics and its applications. The main contributors to the journal are Members and Corresponding Members of the RAS, as well as scientists from the former Soviet Union and other countries, including outstanding Russian mathematicians.