Distribution-Regularized Federated Learning on Non-IID Data

2023 IEEE 39th International Conference on Data Engineering (ICDE), Pub Date: 2023-04-01, DOI: 10.1109/ICDE55515.2023.00164

Yansheng Wang, Yongxin Tong, Zimu Zhou, Ruisheng Zhang, Sinno Jialin Pan, Lixin Fan, Qiang Yang

Abstract

Federated learning (FL) has recently emerged as a popular machine learning paradigm. Compared with traditional distributed learning, its unique challenges lie mainly in communication efficiency and the non-IID (heterogeneous data) problem. While the widely adopted FedAvg framework can significantly reduce communication overhead, its effectiveness on non-IID data remains underexplored. In this paper, we study the non-IID problem of FL from the perspective of domain adaptation. We propose a distribution regularization for FL on non-IID data that reduces the discrepancy between the data distributions of clients. To further reduce the communication cost, we devise two novel distributed learning algorithms, rFedAvg and rFedAvg+, for efficient learning with the distribution regularization. More importantly, we theoretically establish their convergence for strongly convex objectives. Extensive experiments on 4 datasets with both CNN and LSTM as learning models verify the effectiveness and efficiency of the proposed algorithms.
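The abstract does not spell out the regularizer or the update rules of rFedAvg and rFedAvg+, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that each client summarizes its data by a mean feature vector and that the discrepancy between clients is measured as the squared distance between those means; the function names and the weight `lam` are hypothetical, chosen for illustration.

```python
import numpy as np


def fedavg(client_weights, client_sizes):
    """Standard FedAvg step: average client parameters, weighted by sample count."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)  # shape: (num_clients, dim)
    return (sizes[:, None] * stacked).sum(axis=0) / sizes.sum()


def mean_embedding(features):
    """Mean feature vector of one client's data (assumed summary statistic)."""
    return np.asarray(features, dtype=float).mean(axis=0)


def distribution_penalty(local_features, other_embeddings):
    """Average squared distance from this client's mean embedding to the others'."""
    local = mean_embedding(local_features)
    others = np.stack(other_embeddings)  # shape: (num_other_clients, dim)
    return float(((others - local) ** 2).sum(axis=1).mean())


def regularized_loss(task_loss, local_features, other_embeddings, lam=0.1):
    """Local objective: task loss plus the weighted distribution penalty (lam is assumed)."""
    return task_loss + lam * distribution_penalty(local_features, other_embeddings)


if __name__ == "__main__":
    # Two clients with different amounts of data.
    global_w = fedavg([np.array([0.0, 0.0]), np.array([4.0, 4.0])], [1, 3])
    print("aggregated weights:", global_w)

    # One client's features versus the mean embedding shared by another client.
    local_feats = [[0.0, 0.0], [2.0, 2.0]]
    print("local loss:", regularized_loss(1.0, local_feats, [np.array([3.0, 1.0])], lam=0.5))
```

In this sketch, clients would need to exchange only their mean embeddings rather than raw data, which is one plausible way a distribution penalty could be evaluated in a federated setting; how the paper actually limits that communication is not stated in the abstract.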
