Xia Jiang , Xianlin Zeng , Lihua Xie , Jian Sun , Jie Chen
{"title":"用于非凸优化的方差缩小重洗梯度下降法:集中式和分布式算法","authors":"Xia Jiang , Xianlin Zeng , Lihua Xie , Jian Sun , Jie Chen","doi":"10.1016/j.automatica.2024.111954","DOIUrl":null,"url":null,"abstract":"<div><div>Nonconvex finite-sum optimization plays a crucial role in signal processing and machine learning, fueling the development of numerous centralized and distributed stochastic algorithms. However, existing stochastic optimization algorithms often suffer from high stochastic gradient variance due to the use of random sampling with replacement. To address this issue, this paper introduces an explicit variance-reduction step and proposes variance-reduced reshuffling gradient algorithms with a sampling-without-replacement scheme. Specifically, this paper proves that the proposed centralized variance-reduced reshuffling gradient algorithm (VR-RG) with constant step sizes converges to a stationary point for nonconvex optimization under the Kurdyka–Łojasiewicz condition. Moreover, for nonconvex optimization over connected multi-agent networks, the proposed distributed variance-reduced reshuffling gradient algorithm (DVR-RG) converges to a neighborhood of stationary points, where the neighborhood can be made arbitrarily small under mild conditions. Notably, the proposed DVR-RG requires only one communication round at each epoch. Finally, numerical simulations demonstrate the efficiency of the proposed algorithms.</div></div>","PeriodicalId":55413,"journal":{"name":"Automatica","volume":"171 ","pages":"Article 111954"},"PeriodicalIF":4.8000,"publicationDate":"2024-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Variance-reduced reshuffling gradient descent for nonconvex optimization: Centralized and distributed algorithms\",\"authors\":\"Xia Jiang , Xianlin Zeng , Lihua Xie , Jian Sun , Jie Chen\",\"doi\":\"10.1016/j.automatica.2024.111954\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Nonconvex finite-sum optimization plays a crucial role in signal processing and machine learning, fueling the development of numerous centralized and distributed stochastic algorithms. However, existing stochastic optimization algorithms often suffer from high stochastic gradient variance due to the use of random sampling with replacement. To address this issue, this paper introduces an explicit variance-reduction step and proposes variance-reduced reshuffling gradient algorithms with a sampling-without-replacement scheme. Specifically, this paper proves that the proposed centralized variance-reduced reshuffling gradient algorithm (VR-RG) with constant step sizes converges to a stationary point for nonconvex optimization under the Kurdyka–Łojasiewicz condition. Moreover, for nonconvex optimization over connected multi-agent networks, the proposed distributed variance-reduced reshuffling gradient algorithm (DVR-RG) converges to a neighborhood of stationary points, where the neighborhood can be made arbitrarily small under mild conditions. Notably, the proposed DVR-RG requires only one communication round at each epoch. 
Finally, numerical simulations demonstrate the efficiency of the proposed algorithms.</div></div>\",\"PeriodicalId\":55413,\"journal\":{\"name\":\"Automatica\",\"volume\":\"171 \",\"pages\":\"Article 111954\"},\"PeriodicalIF\":4.8000,\"publicationDate\":\"2024-09-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Automatica\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0005109824004485\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"AUTOMATION & CONTROL SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Automatica","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0005109824004485","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
Variance-reduced reshuffling gradient descent for nonconvex optimization: Centralized and distributed algorithms
Nonconvex finite-sum optimization plays a crucial role in signal processing and machine learning, fueling the development of numerous centralized and distributed stochastic algorithms. However, existing stochastic optimization algorithms often suffer from high stochastic gradient variance due to the use of random sampling with replacement. To address this issue, this paper introduces an explicit variance-reduction step and proposes variance-reduced reshuffling gradient algorithms with a sampling-without-replacement scheme. Specifically, this paper proves that the proposed centralized variance-reduced reshuffling gradient algorithm (VR-RG) with constant step sizes converges to a stationary point for nonconvex optimization under the Kurdyka–Łojasiewicz condition. Moreover, for nonconvex optimization over connected multi-agent networks, the proposed distributed variance-reduced reshuffling gradient algorithm (DVR-RG) converges to a neighborhood of stationary points, where the neighborhood can be made arbitrarily small under mild conditions. Notably, the proposed DVR-RG requires only one communication round at each epoch. Finally, numerical simulations demonstrate the efficiency of the proposed algorithms.
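The abstract describes, but does not reproduce, the update rules of VR-RG. Below is a minimal Python sketch of the two ingredients it names: sampling without replacement (a fresh permutation each epoch) and an explicit variance-reduction step taken against a full-gradient snapshot. The function name `vr_reshuffling_gd`, its signature, and the SVRG-style control variate used here are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def vr_reshuffling_gd(grad_i, x0, n, step, epochs, rng=None):
    """Hypothetical sketch, not the authors' VR-RG.
    grad_i(x, i): gradient of the i-th component function f_i at x.
    n: number of component functions; step: constant step size."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(epochs):
        # Snapshot point and full gradient, computed once per epoch.
        x_snap = x.copy()
        full_grad = sum(grad_i(x_snap, i) for i in range(n)) / n
        # Sampling without replacement: every index is visited exactly
        # once per epoch, in a freshly reshuffled order.
        for i in rng.permutation(n):
            # SVRG-style control variate: the correction term shrinks
            # the stochastic gradient variance near the snapshot point.
            g = grad_i(x, i) - grad_i(x_snap, i) + full_grad
            x -= step * g
    return x
```

A toy usage with hypothetical data (a convex least-squares instance is used only for simplicity; the paper targets nonconvex objectives):

```python
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 5))
b = A @ np.ones(5)
g = lambda x, i: 2.0 * A[i] * (A[i] @ x - b[i])
x_hat = vr_reshuffling_gd(g, np.zeros(5), n=50, step=0.01, epochs=200)
```

For the distributed setting, the abstract's key structural claim is that DVR-RG needs only one communication round per epoch. A hedged sketch of that communication pattern follows, assuming a doubly stochastic mixing matrix W for the connected multi-agent network; again, this illustrates the schedule the abstract describes rather than the paper's precise update.

```python
def dvr_rg_sketch(grad_ki, x0, m, n_local, W, step, epochs, rng=None):
    """Hypothetical sketch, not the authors' DVR-RG.
    grad_ki(k, x, i): gradient of agent k's i-th local component at x.
    m: number of agents; n_local: samples per agent;
    W: (m x m) doubly stochastic mixing matrix of the connected network."""
    rng = np.random.default_rng() if rng is None else rng
    X = np.tile(np.asarray(x0, dtype=float), (m, 1))  # one row per agent
    for _ in range(epochs):
        # Local work: each agent runs a full variance-reduced,
        # reshuffled pass over its own data, with no communication.
        for k in range(m):
            x_snap = X[k].copy()
            full_g = sum(grad_ki(k, x_snap, i) for i in range(n_local)) / n_local
            for i in rng.permutation(n_local):
                g = grad_ki(k, X[k], i) - grad_ki(k, x_snap, i) + full_g
                X[k] = X[k] - step * g
        # The single communication round of the epoch: one mixing
        # (consensus-averaging) step with the neighbors encoded in W.
        X = W @ X
    return X.mean(axis=0)
```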
Journal Introduction:
Automatica is a leading archival publication in the field of systems and control. The field today encompasses a broad set of areas and topics, and is thriving not only in its own right but also through its impact on other fields such as communications, computing, biology, energy and economics. Since its inception in 1963, Automatica has kept abreast of the field's evolution over the years and has emerged as a leading publication driving its trends.
Founded in 1963, Automatica became a journal of the International Federation of Automatic Control (IFAC) in 1969. It features a characteristic blend of theoretical and applied papers of archival, lasting value, reporting cutting-edge research results by authors from across the globe. Articles appear in distinct categories, including regular, brief and survey papers, technical communiqués, correspondence items, and reviews of published books of interest to the readership. The journal occasionally publishes special issues on emerging new topics or on established, mature topics of interest to a broad audience.
Automatica solicits original, high-quality contributions in all the categories listed above and in all areas of systems and control, interpreted in a broad and constantly evolving sense. Manuscripts may be submitted directly to a subject editor, or to the Editor-in-Chief if the author is unsure of the appropriate subject area. Editorial procedures are in place to ensure careful, fair and prompt handling of all submitted articles, and accepted papers appear in the journal in the shortest time feasible given production constraints.