Distribution-Regularized Federated Learning on Non-IID Data
Yansheng Wang, Yongxin Tong, Zimu Zhou, Ruisheng Zhang, Sinno Jialin Pan, Lixin Fan, Qiang Yang
2023 IEEE 39th International Conference on Data Engineering (ICDE), published 2023-04-01
DOI: 10.1109/ICDE55515.2023.00164 (https://doi.org/10.1109/ICDE55515.2023.00164)
Citation count: 0
Abstract
Federated learning (FL) has recently emerged as a popular machine learning paradigm. Compared with traditional distributed learning, its unique challenges lie mainly in communication efficiency and the non-IID (heterogeneous data) problem. While the widely adopted FedAvg framework can significantly reduce communication overhead, its effectiveness on non-IID data remains underexplored. In this paper, we study the non-IID problem of FL from the perspective of domain adaptation. We propose a distribution regularization for FL on non-IID data that reduces the discrepancy between clients' data distributions. To further reduce the communication cost, we devise two novel distributed learning algorithms, rFedAvg and rFedAvg+, for learning efficiently with the distribution regularization. More importantly, we theoretically establish their convergence for strongly convex objectives. Extensive experiments on four datasets, with both CNN and LSTM as learning models, verify the effectiveness and efficiency of the proposed algorithms.
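To make the idea of a distribution-regularized objective concrete, the following is a minimal sketch and not the paper's rFedAvg or rFedAvg+ algorithm. The abstract does not say which discrepancy measure is used, so the sketch assumes one: the squared distance between clients' mean feature embeddings. The function names (`fedavg`, `mean_embedding_discrepancy`, `regularized_loss`) and the weight `lam` are hypothetical and chosen for illustration only.

```python
import numpy as np


def fedavg(client_weights, client_sizes):
    """Size-weighted average of client parameter vectors (FedAvg-style aggregation)."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)  # shape: (num_clients, dim)
    return (sizes[:, None] * stacked).sum(axis=0) / sizes.sum()


def mean_embedding_discrepancy(local_features, other_means):
    """Assumed regularizer (not taken from the abstract): squared L2 distance
    between this client's mean feature embedding and each other client's mean
    embedding, averaged over the other clients."""
    local_mean = local_features.mean(axis=0)
    diffs = np.stack(other_means) - local_mean
    return float((diffs ** 2).sum(axis=1).mean())


def regularized_loss(task_loss, local_features, other_means, lam=0.1):
    """Local objective = task loss + lam * distribution discrepancy.
    `lam` is a hypothetical trade-off weight."""
    return task_loss + lam * mean_embedding_discrepancy(local_features, other_means)


if __name__ == "__main__":
    # Two clients with different data sizes; the larger client dominates the average.
    global_w = fedavg([np.array([1.0, 1.0]), np.array([3.0, 3.0])], [1, 3])
    print("aggregated weights:", global_w)

    # Local features whose mean is [1, 1], compared with another client's mean [2, 1].
    feats = np.array([[0.0, 0.0], [2.0, 2.0]])
    print("regularized loss:", regularized_loss(0.5, feats, [np.array([2.0, 1.0])]))
```

In this sketch each client needs only the other clients' mean embeddings rather than their raw data, which is one plausible way to keep the regularizer cheap to communicate. How the paper actually exchanges this information is not stated in the abstract.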