Towards Practical Overlay Networks for Decentralized Federated Learning

Yifan Hua, Jinlong Pang, Xiaoxue Zhang, Yi Liu, Xiaofeng Shi, Bao Wang, Yang Liu, Chen Qian

arXiv:2409.05331, arXiv - CS - Networking and Internet Architecture, 2024-09-09
Abstract
Decentralized federated learning (DFL) uses peer-to-peer communication to avoid the single point of failure in federated learning and has been considered an attractive solution for machine learning tasks on distributed devices. We provide the first solution to a fundamental network problem of DFL: what overlay network should DFL use to achieve fast training of highly accurate models, low communication cost, and decentralized construction and maintenance? Overlay topologies for DFL have been investigated, but no existing DFL topology includes decentralized protocols for network construction and topology maintenance; without such protocols, DFL cannot run in practice. This work presents an overlay network, called FedLay, which provides fast training and low communication cost for practical DFL. FedLay is the first solution for constructing near-random regular topologies in a decentralized manner and maintaining them under node joins and failures. Experiments based on a prototype implementation and simulations show that FedLay achieves the fastest model convergence and the highest accuracy on real datasets compared to existing DFL solutions, while incurring small communication costs and remaining resilient to node joins and failures.
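The abstract does not spell out FedLay's join protocol, but its core invariant, a near-random regular topology that survives node joins, can be illustrated with a small sketch. The Python snippet below is an assumption-based illustration, not FedLay's actual decentralized protocol: the `join` function and the use of `networkx` as a stand-in for a real peer-to-peer membership layer are ours. It adds a node to a d-regular graph by splitting d/2 randomly chosen vertex-disjoint edges, which preserves every existing node's degree and gives the newcomer exactly degree d.

```python
import random
import networkx as nx  # assumption: networkx stands in for a real P2P membership layer


def join(g: nx.Graph, new_node, d: int) -> None:
    """Add new_node to a d-regular graph g while keeping it d-regular.

    Picks d/2 vertex-disjoint edges at random, removes each edge (u, v),
    and connects new_node to both u and v. Every existing node keeps its
    degree, and new_node ends up with exactly degree d.
    """
    assert d % 2 == 0, "edge splitting needs an even degree"
    # Resample until the chosen edges share no endpoints, so no node
    # loses or gains more than one edge per split.
    while True:
        edges = random.sample(list(g.edges()), d // 2)
        endpoints = [x for e in edges for x in e]
        if len(endpoints) == len(set(endpoints)):  # vertex-disjoint choice
            break
    g.add_node(new_node)
    for u, v in edges:
        g.remove_edge(u, v)
        g.add_edge(new_node, u)
        g.add_edge(new_node, v)


# Usage: start from a random 4-regular overlay of 20 nodes and add node 20.
d = 4
g = nx.random_regular_graph(d, 20)
join(g, 20, d)
assert all(deg == d for _, deg in g.degree())
```

Departures can be handled symmetrically in schemes of this kind, with the failed node's d neighbors re-pairing among themselves to restore regularity, though FedLay's actual maintenance protocol may differ in its details.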