DISH: A Distributed Hybrid Optimization Method Leveraging System Heterogeneity

IF 4.6 | CAS Zone 2 (Engineering & Technology) | JCR Q1, ENGINEERING, ELECTRICAL & ELECTRONIC
Xiaochun Niu;Ermin Wei
DOI: 10.1109/TSP.2024.3450351
Journal: IEEE Transactions on Signal Processing, vol. 72, pp. 4007-4021
Published: 2024-08-26
URL: https://ieeexplore.ieee.org/document/10648947/
Citations: 0

Abstract

We study distributed optimization problems over multi-agent networks, including consensus and network flow problems. Existing distributed methods neglect the heterogeneity among agents' computational capabilities, limiting their effectiveness. To address this, we propose DISH, a distributed hybrid method that leverages system heterogeneity. DISH allows agents with higher computational capabilities or lower computational costs to perform local Newton-type updates, while others adopt simpler gradient-type updates. Notably, DISH covers existing methods such as EXTRA, DIGing, and ESOM-0 as special cases. To analyze DISH's performance with general update directions, we formulate distributed problems as minimax problems and introduce GRAND (gradient-related ascent and descent) and its alternating version, Alt-GRAND, for solving them. GRAND generalizes DISH to centralized minimax settings, accommodating various descent-ascent update directions, including gradient-type, Newton-type, scaled-gradient, and other general directions within acute angles of the partial gradients. Theoretical analysis establishes global sublinear and linear convergence rates for GRAND and Alt-GRAND in strongly-convex-nonconcave and strongly-convex-PL settings, which yields linear rates for DISH. In addition, we derive the local superlinear convergence of Newton-based variants of GRAND in centralized settings to show the potential and limitations of Newton's method in distributed settings. Numerical experiments validate the effectiveness of our methods.
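To make the descent-ascent template that GRAND generalizes concrete, here is a minimal sketch of plain alternating gradient descent-ascent on a toy strongly-convex-strongly-concave saddle problem. The objective, step size, and iteration count are illustrative assumptions, not the paper's setup or the GRAND algorithm itself.

```python
# Toy saddle problem: f(x, y) = 0.5*x**2 + x*y - 0.5*y**2
# (strongly convex in x, strongly concave in y; saddle point at the origin).
def grad_x(x, y):
    return x + y      # partial derivative of f with respect to x

def grad_y(x, y):
    return x - y      # partial derivative of f with respect to y

x, y = 1.0, 1.0
eta = 0.1             # step size (illustrative)
for _ in range(500):
    x = x - eta * grad_x(x, y)   # descent step on the min variable
    y = y + eta * grad_y(x, y)   # ascent step on the max variable, using the fresh x (alternating)

print(abs(x) < 1e-3 and abs(y) < 1e-3)   # True: iterates contract toward the saddle point (0, 0)
```

Using the freshly updated `x` in the ascent step is what makes the scheme "alternating" in the Alt-GRAND sense; a simultaneous variant would evaluate both partial gradients at the old iterate.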
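The abstract's "acute angle" condition on update directions can be illustrated in isolation: any direction whose inner product with the gradient is positive is a descent direction for a small enough step, which is why gradient-type, Newton-type, and scaled-gradient updates all fit one analysis. The quadratic objective, scaling matrix, and step size below are illustrative assumptions, not the paper's construction.

```python
import numpy as np

# Strongly convex quadratic f(x) = 0.5 * x^T A x with A positive definite; minimizer at 0.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x

x = np.array([1.0, -2.0])
g = grad(x)

# A "gradient-related" direction: a positive-definite rescaling of the gradient.
D = np.diag([0.5, 2.0])    # any positive-definite D keeps the angle with g acute
d = D @ g

print(g @ d > 0)              # True: d makes an acute angle with the gradient
print(f(x - 0.1 * d) < f(x))  # True: hence a small step along -d decreases f
```

Replacing `D` with the inverse Hessian `np.linalg.inv(A)` gives the Newton direction, which satisfies the same acute-angle condition; this is the sense in which one framework covers both update types.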
Source journal: IEEE Transactions on Signal Processing (Engineering & Technology – Electronic & Electrical Engineering)
CiteScore: 11.20
Self-citation rate: 9.30%
Articles per year: 310
Review time: 3.0 months
Journal description: The IEEE Transactions on Signal Processing covers novel theory, algorithms, performance analyses and applications of techniques for the processing, understanding, learning, retrieval, mining, and extraction of information from signals. The term "signal" includes, among others, audio, video, speech, image, communication, geophysical, sonar, radar, medical and musical signals. Examples of topics of interest include, but are not limited to, information processing and the theory and application of filtering, coding, transmitting, estimating, detecting, analyzing, recognizing, synthesizing, recording, and reproducing signals.