Self-adaptive asynchronous federated optimizer with adversarial sharpness-aware minimization

Future Generation Computer Systems: The International Journal of eScience · Impact Factor 6.2 · JCR Q1 (Computer Science, Theory & Methods) · CAS Tier 2 (Computer Science)
DOI: 10.1016/j.future.2024.07.045 · Published: 2024-07-31 · Citations: 0
Full text: https://www.sciencedirect.com/science/article/pii/S0167739X24004175

Abstract

The past years have witnessed the success of a distributed learning system called Federated Learning (FL). Recently, asynchronous FL (AFL) has demonstrated its potential for concurrency compared to mainstream synchronous FL. However, the inherent system and statistical heterogeneity presents several impediments to AFL: on the client side, discrepancies in client round trips and local model drift impede global performance enhancement; on the server side, dynamic communication leads to significant fluctuations in gradient arrival times, while asynchronously arriving gradients of ambiguous value are not fully leveraged. In this paper, we propose an adaptive AFL framework, ARDAGH, which systematically addresses these challenges. First, to address the discrepancies in client trips, ARDAGH ensures their convergence by incorporating only 1-bit feedback information into the downlink. Second, to counter client drift, ARDAGH generalizes the local models by employing our novel adversarial sharpness-aware minimization, which does not rely on additional global variables. Third, to cope with gradient latency, ARDAGH employs a communication-aware dropout strategy that adaptively compresses gradients so that transmission times remain similar across clients. Finally, to fully unleash the potential of each gradient, we establish a consistent optimization direction by conceptualizing the aggregation as an optimizer with successive momentum. From the comprehensive solution provided by ARDAGH, an algorithm named FedAMO is derived, and its superiority is confirmed by experimental results obtained under challenging prototype and simulation settings. Particularly in typical sentiment analysis tasks, FedAMO demonstrates an improvement of up to 5.351% with a 20.056-fold acceleration compared to conventional asynchronous methods.
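The local-model generalization step is easiest to see in code. Below is a minimal sketch of a sharpness-aware local update in the spirit of standard SAM (ascend to a worst-case weight perturbation, then descend with the gradient taken there). The paper's adversarial variant is not specified in this abstract, so the model, loss, and perturbation radius `rho` are illustrative assumptions, not ARDAGH's actual procedure.

```python
import torch

def sam_local_step(model, loss_fn, x, y, lr=0.01, rho=0.05):
    """One sharpness-aware local step (standard two-pass SAM sketch)."""
    params = list(model.parameters())

    # First pass: gradient at the current weights.
    loss = loss_fn(model(x), y)
    grads = torch.autograd.grad(loss, params)
    grad_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads)) + 1e-12

    # Ascent step: perturb weights toward the locally sharpest direction.
    eps = [rho * g / grad_norm for g in grads]
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.add_(e)

    # Second pass: sharpness-aware gradient at the perturbed weights.
    loss_perturbed = loss_fn(model(x), y)
    grads_sam = torch.autograd.grad(loss_perturbed, params)

    # Undo the perturbation, then descend with the sharpness-aware gradient.
    with torch.no_grad():
        for p, e, g in zip(params, eps, grads_sam):
            p.sub_(e)
            p.sub_(lr * g)
    return loss.item()
```

Minimizing loss at the perturbed point flattens the local minimum, which is why such updates tend to generalize better under client drift.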
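The abstract's final idea, treating aggregation as an optimizer with successive momentum over asynchronously arriving gradients, can be sketched as follows. The staleness discount `alpha` and the hyperparameters are assumptions for illustration, not the rule FedAMO uses.

```python
import torch

class MomentumAggregator:
    """Server-side aggregation viewed as a momentum optimizer (sketch)."""

    def __init__(self, global_params, lr=1.0, beta=0.9):
        self.params = [p.detach().clone() for p in global_params]
        self.momentum = [torch.zeros_like(p) for p in self.params]
        self.lr, self.beta = lr, beta
        self.version = 0  # global model version counter

    def apply(self, client_grads, client_version):
        # Hypothetical staleness discount: older gradients count for less.
        staleness = self.version - client_version
        alpha = 1.0 / (1.0 + staleness)
        for m, p, g in zip(self.momentum, self.params, client_grads):
            m.mul_(self.beta).add_(alpha * g)  # successive momentum
            p.sub_(self.lr * m)                # consistent descent direction
        self.version += 1
        return self.params
```

Because the momentum buffer accumulates every arrival, even a stale or noisy gradient nudges the running direction rather than being discarded, which matches the stated goal of fully leveraging each gradient.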
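The communication-aware dropout strategy can likewise be approximated with top-k gradient sparsification sized to a per-client time budget. The bandwidth measurement, byte accounting, and `target_seconds` budget below are hypothetical stand-ins for whatever ARDAGH actually measures.

```python
import math
import torch

def compress_for_budget(grad, bandwidth_bytes_per_s, target_seconds,
                        bytes_per_value=8):
    """Keep only as many (index, value) pairs as fit the time budget."""
    budget = int(bandwidth_bytes_per_s * target_seconds / (2 * bytes_per_value))
    k = min(max(1, budget), grad.numel())
    flat = grad.flatten()
    _, indices = torch.topk(flat.abs(), k)  # largest-magnitude entries
    return indices, flat[indices]

def decompress(indices, values, shape):
    """Rebuild a dense gradient with zeros in the dropped positions."""
    flat = torch.zeros(math.prod(shape), dtype=values.dtype)
    flat[indices] = values
    return flat.reshape(shape)
```

Scaling `k` to each client's measured bandwidth makes upload times roughly uniform, which is the property the abstract attributes to the dropout strategy.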

Journal metrics: CiteScore 19.90 · Self-citation rate 2.70% · Annual publications 376 · Average review time 10.6 months
About the journal: Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications. Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration. Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.