DRACO: Decentralized Asynchronous Federated Learning Over Row-Stochastic Wireless Networks

Impact Factor: 6.3 · JCR Q1 (Engineering, Electrical & Electronic)
Eunjeong Jeong;Marios Kountouris
DOI: 10.1109/OJCOMS.2025.3574098
Journal: IEEE Open Journal of the Communications Society, vol. 6, pp. 4818-4839
Published: 2025-03-27 (Journal Article)
Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11016099
Article page: https://ieeexplore.ieee.org/document/11016099/
Citations: 0

Abstract

Emerging technologies and use cases, such as smart Internet of Things (IoT), Internet of Agents, and Edge AI, have generated significant interest in training neural networks over fully decentralized, serverless networks. A major obstacle in this context is ensuring stable convergence without imposing stringent assumptions, such as identical data distributions across devices or synchronized updates. In this paper, we introduce DRACO, a novel framework for decentralized asynchronous Stochastic Gradient Descent (SGD) over row-stochastic gossip wireless networks. Our approach leverages continuous communication, allowing edge devices to perform local training and exchange model updates along a continuous timeline, thereby eliminating the need for synchronized timing. Additionally, our algorithm decouples communication and computation schedules, enabling complete autonomy for all users while effectively addressing straggler issues. Through a thorough convergence analysis, we show that DRACO achieves high performance in decentralized optimization while maintaining low variance across users even without predefined scheduling policies. Numerical experiments further validate the effectiveness of our approach, demonstrating that controlling the maximum number of received messages per client significantly reduces redundant communication costs while maintaining robust learning performance.
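The abstract describes nodes that combine local SGD steps with row-stochastic gossip mixing, i.e., each node forms a weighted average (weights summing to 1) of the models it receives. The sketch below is a minimal illustration of that idea only, not the paper's DRACO algorithm: the random mixing weights, scalar quadratic objectives, and synchronous rounds are all simplifying assumptions, and the continuous-time asynchrony central to DRACO is not modeled.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5  # number of edge devices

# Random row-stochastic mixing matrix: each row sums to 1, so every
# node's new model is a convex combination of the models it receives.
W = rng.random((n, n))
W /= W.sum(axis=1, keepdims=True)

# Toy local objectives f_i(x) = 0.5 * (x - t_i)^2 with distinct targets,
# mimicking heterogeneous (non-IID) local data. One scalar model per node.
targets = rng.normal(size=n)
x = rng.normal(size=n)
lr = 0.1

for _ in range(200):
    grads = x - targets          # local gradient step on each node
    x = W @ (x - lr * grads)     # gossip: row-stochastic mixing of updates

# The spread across nodes shrinks well below the spread of local optima,
# illustrating the "low variance across users" the abstract refers to.
print(np.std(x), np.std(targets))
```

Row stochasticity (rows summing to 1) is what lets each node normalize its own incoming weights locally, without any global coordination of who transmits when, which is why it suits fully decentralized, serverless settings.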
Source journal metrics:
CiteScore: 13.70
Self-citation rate: 3.80%
Articles per year: 94
Review time: 10 weeks
About the journal: The IEEE Open Journal of the Communications Society (OJ-COMS) is an open-access, all-electronic journal that publishes original high-quality manuscripts on advances in the state of the art of telecommunications systems and networks. Papers in IEEE OJ-COMS are indexed in Scopus. Submissions reporting new theoretical findings (including novel methods, concepts, and studies) and practical contributions (including experiments and development of prototypes) are welcome, as are survey and tutorial articles. IEEE OJ-COMS received its debut impact factor of 7.9 according to the Journal Citation Reports (JCR) 2023. The journal covers science, technology, applications, and standards for information organization, collection, and transfer using electronic, optical, and wireless channels and networks. Specific areas covered include:
- Systems and network architecture, control and management
- Protocols, software, and middleware
- Quality of service, reliability, and security
- Modulation, detection, coding, and signaling
- Switching and routing
- Mobile and portable communications
- Terminals and other end-user devices
- Networks for content distribution and distributed computing
- Communications-based distributed resources control