GWPF: Communication-efficient federated learning with Gradient-Wise Parameter Freezing
Duo Yang, Yunqi Gao, Bing Hu, A-Long Jin, Wei Wang, Yang You
Computer Networks, Volume 255, Article 110886 (published 2024-11-04). DOI: 10.1016/j.comnet.2024.110886
Citations: 0
Abstract
The communication bottleneck is a critical challenge in federated learning. Parameter freezing, which uses fine-grained parameters as aggregation objects, has emerged as a popular way to address it, but existing methods suffer from the lack of a thawing strategy, lag and inflexibility in the thawing process, and underutilization of frozen parameters' updates. To address these challenges, we propose Gradient-Wise Parameter Freezing (GWPF), a mechanism that controls the frozen periods of different parameters through parameter freezing and thawing strategies. GWPF globally freezes parameters with insignificant gradients and excludes them from global updates during the frozen period, reducing communication overhead and accelerating training. The thawing strategy, based on global decisions by the server in collaboration with clients, leverages real-time feedback on the locally accumulated gradients of frozen parameters in each round, striking a balance between reducing communication and improving model accuracy. We provide a theoretical analysis and a convergence guarantee for non-convex objectives. Extensive experiments confirm that GWPF achieves a speedup of up to 4.52x in time-to-accuracy performance and reduces communication overhead by up to 48.73%. It also improves final model accuracy by up to 2.01% compared with APF, the fastest existing method. The code for GWPF is available at https://github.com/Dora233/GWPF.
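To make the mechanism concrete, the sketch below shows a per-round server-side freeze/thaw decision in the spirit the abstract describes. It is a minimal illustration, not the paper's implementation: the magnitude-based significance test, the thresholds `freeze_thr` and `thaw_thr`, and the function name are all assumptions made for illustration; see the GWPF repository for the actual algorithm.

```python
import numpy as np

def gwpf_round(avg_grad, accum_frozen_grad, frozen_mask,
               freeze_thr=1e-4, thaw_thr=1e-3):
    """Illustrative per-round freeze/thaw decision (assumed, simplified).

    avg_grad:          gradients averaged over clients for active
                       (non-frozen) parameters; frozen entries unused.
    accum_frozen_grad: per-parameter gradients that clients accumulate
                       locally for frozen parameters and report as
                       real-time feedback each round.
    frozen_mask:       boolean array, True where a parameter is frozen.
    Returns the updated frozen mask for the next round.
    """
    new_mask = frozen_mask.copy()
    # Freeze: active parameters whose global gradient is insignificant
    # stop being communicated and are excluded from global updates.
    new_mask[~frozen_mask & (np.abs(avg_grad) < freeze_thr)] = True
    # Thaw: frozen parameters whose locally accumulated gradients have
    # grown significant re-enter global aggregation.
    new_mask[frozen_mask & (np.abs(accum_frozen_grad) > thaw_thr)] = False
    return new_mask
```

Under this sketch, a client would transmit only the entries where the mask is False and keep a running sum of gradients at frozen positions, which is what lets the server thaw a parameter promptly instead of relying on a fixed, lagging schedule.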
About the journal:
Computer Networks is an international, archival journal providing a publication vehicle for complete coverage of all topics of interest to those involved in computer communications networking. Its audience includes researchers, managers, and operators of networks, as well as designers and implementers. The Editorial Board will consider any material for publication that is of interest to those groups.