Authors: Daniel Pérez Herrera; Zheng Chen; Erik G. Larsson
Journal: IEEE Open Journal of the Communications Society, vol. 6, pp. 1497-1511
Publication date: 2025-02-10
DOI: 10.1109/OJCOMS.2025.3540133
Publication type: Journal Article
Impact factor: 6.3 (JCR Q1, Engineering, Electrical & Electronic)
Article page: https://ieeexplore.ieee.org/document/10879080/
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10879080
Faster Convergence With Less Communication: Broadcast-Based Subgraph Sampling for Decentralized Learning Over Wireless Networks
Decentralized stochastic gradient descent (D-SGD) is a widely adopted optimization algorithm for decentralized training of machine learning models across networked agents. A crucial part of D-SGD is the consensus-based model averaging, which heavily relies on information exchange and fusion among the nodes. For consensus averaging over wireless networks, due to the broadcast nature of wireless channels, simultaneous transmissions from multiple nodes may cause packet collisions if they share a common receiver. Therefore, communication coordination is necessary to determine when and how a node can transmit (or receive) information to (or from) its neighbors. In this work, we propose BASS, a broadcast-based subgraph sampling method designed to accelerate the convergence of D-SGD while considering the actual communication cost per iteration. BASS creates a set of mixing matrix candidates that represent sparser subgraphs of the base topology. In each consensus iteration, one mixing matrix is randomly sampled, leading to a specific scheduling decision that activates multiple collision-free subsets of nodes. The sampling occurs in a probabilistic manner, and the elements of the mixing matrices, along with their sampling probabilities, are jointly optimized. Simulation results demonstrate that BASS achieves faster convergence and requires fewer transmission slots than existing link-based scheduling methods and the full communication scenario. In conclusion, the inherent broadcasting nature of wireless channels offers intrinsic advantages in accelerating the convergence of decentralized optimization and learning.
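The sampling step described in the abstract (draw one mixing matrix per consensus iteration according to a probability distribution) can be illustrated with a minimal sketch. This is not the authors' BASS procedure: the two candidate matrices, their sampling probabilities, the 4-node path topology, and the function names `mix` and `sampled_consensus` are all assumptions chosen by hand for illustration, whereas BASS jointly optimizes the matrix entries and probabilities and derives collision-free broadcast schedules. The sketch also omits the local gradient step of D-SGD and shows only the consensus averaging.

```python
import random


def mix(W, x):
    """One consensus step: x_i <- sum_j W[i][j] * x_j."""
    n = len(x)
    return [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]


def sampled_consensus(candidates, probs, x0, num_iters, seed=0):
    """Run consensus averaging, sampling one mixing matrix per iteration."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(num_iters):
        # Draw one candidate matrix according to its sampling probability.
        W = rng.choices(candidates, weights=probs, k=1)[0]
        x = mix(W, x)
    return x


# Hypothetical base topology: path graph 0 - 1 - 2 - 3.
# Each candidate is a symmetric, doubly stochastic matrix on a sparser subgraph.
W_OUTER = [  # uses links (0,1) and (2,3)
    [0.5, 0.5, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.5, 0.5],
]
W_INNER = [  # uses link (1,2) only; nodes 0 and 3 keep their values
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

x_init = [0.0, 4.0, 8.0, 12.0]  # local values; their average is 6.0
x_final = sampled_consensus([W_OUTER, W_INNER], [0.5, 0.5], x_init, num_iters=200)
print(x_final)
```

Because every candidate is doubly stochastic, each step preserves the network-wide average, and sampling both subgraphs often enough drives all nodes toward that average even though no single iteration uses the full topology.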
About the journal:
The IEEE Open Journal of the Communications Society (OJ-COMS) is an open access, all-electronic journal that publishes original high-quality manuscripts on advances in the state of the art of telecommunications systems and networks. Papers published in IEEE OJ-COMS are indexed in Scopus. Submissions reporting new theoretical findings (including novel methods, concepts, and studies) and practical contributions (including experiments and development of prototypes) are welcome. Survey and tutorial articles are also considered. IEEE OJ-COMS received its debut impact factor of 7.9 in the Journal Citation Reports (JCR) 2023.
The IEEE Open Journal of the Communications Society covers science, technology, applications and standards for information organization, collection and transfer using electronic, optical and wireless channels and networks. Some specific areas covered include:
Systems and network architecture, control and management
Protocols, software, and middleware
Quality of service, reliability, and security
Modulation, detection, coding, and signaling
Switching and routing
Mobile and portable communications
Terminals and other end-user devices
Networks for content distribution and distributed computing
Communications-based distributed resources control.