Zero Order Algorithm for Decentralized Optimization Problems

IF 0.5 · JCR Q3 (Mathematics) · CAS Zone 4 (Mathematics)
A. S. Veprikov, E. D. Petrov, G. V. Evseev, A. N. Beznosikov
Doklady Mathematics, Vol. 110, Suppl. 1, pp. S261–S277. Published 2025-03-22. DOI: 10.1134/S1064562424602336
PDF: https://link.springer.com/content/pdf/10.1134/S1064562424602336.pdf
Citations: 0

Abstract

In this paper we consider a distributed optimization problem in the black-box formulation: the target function f is decomposed into the sum of \(n\) functions \({{f}_{i}}\), where \(n\) is the number of workers, and each worker is assumed to have access only to a noisy zeroth-order oracle, i.e., to the values \({{f}_{i}}(x)\) corrupted by noise. We propose a new method, ZO-MARINA, based on the state-of-the-art distributed optimization algorithm MARINA. In particular, two modifications adapt the algorithm to the black-box setting: (i) approximations of the gradient are used instead of its true value, and (ii) the difference of two approximated gradients at a subset of coordinates is used instead of the compression operator. A theoretical convergence analysis is provided for non-convex functions and for functions satisfying the PL condition; the convergence rate of the proposed algorithm is consistent with the results for the algorithm that uses a first-order oracle. The theoretical results are validated in computational experiments on finding optimal hyperparameters for a ResNet-18 neural network trained on the CIFAR-10 dataset, and for an SVM model trained on LibSVM datasets and on the MNIST-784 dataset.
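The two modifications described above can be illustrated with a short sketch. The code below is a minimal illustration under stated assumptions, not the paper's exact estimator: it uses a standard two-point randomized finite-difference estimate of the gradient, and a MARINA-style message that keeps the difference of two such estimates only on a chosen coordinate subset. The function names, the smoothing parameter `mu`, and the coordinate-selection rule are assumptions for the example.

```python
import numpy as np

def zo_gradient(f, x, e, mu=1e-4):
    """Two-point zeroth-order gradient estimate of f at x along direction e:
    g(x) ~= (f(x + mu*e) - f(x - mu*e)) / (2*mu) * e.
    The caller supplies the random direction e (e.g. standard Gaussian)."""
    return (f(x + mu * e) - f(x - mu * e)) / (2 * mu) * e

def sparse_zo_diff(f, x_new, x_old, coords, e, mu=1e-4):
    """MARINA-style communication saving (illustrative): instead of applying a
    compression operator to the gradient, the worker forms the difference of
    two zeroth-order estimates and transmits it only on the coordinate subset
    `coords`; all other entries of the message are zero."""
    d = np.zeros_like(x_new)
    diff = zo_gradient(f, x_new, e, mu) - zo_gradient(f, x_old, e, mu)
    d[coords] = diff[coords]
    return d
```

Averaged over many random directions, the two-point estimator recovers the true gradient in expectation (up to O(mu) smoothing bias), which is the property the convergence analysis of such methods typically relies on.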

Source journal: Doklady Mathematics (Mathematics)
CiteScore: 1.00 · Self-citation rate: 16.70% · Articles per year: 39 · Review time: 3-6 weeks
Journal description: Doklady Mathematics is a journal of the Presidium of the Russian Academy of Sciences. It contains English translations of papers published in Doklady Akademii Nauk (Proceedings of the Russian Academy of Sciences), which was founded in 1933 and is published 36 times a year. Doklady Mathematics covers the following areas: mathematics, mathematical physics, computer science, control theory, and computers. It publishes brief scientific reports on previously unpublished significant new research in mathematics and its applications. The main contributors to the journal are Members and Corresponding Members of the RAS, as well as scientists from the former Soviet Union and other countries, including outstanding Russian mathematicians.