FedDDB: Clustered Federated Learning based on Data Distribution Difference

Chengyu You, Zihao Lu, Junli Wang, Chungang Yan
DOI: 10.1145/3579654.3579732
Venue: Proceedings of the 2022 5th International Conference on Algorithms, Computing and Artificial Intelligence
Published: 2022-12-23

Abstract

Clustered federated learning is a federated learning approach based on multi-task learning: it groups similar clients into the same cluster and shares model parameters within each cluster, mitigating the tendency of a single joint model to become trapped in local optima on non-IID data. Most existing clustered federated learning methods cluster clients by differences in their model parameters. However, when a client's dataset has too few samples or missing feature values, its trained model parameters are biased, which distorts the clustering result. This paper develops a clustered federated learning method based on data distribution difference (FedDDB) that operates at the dataset level. The method focuses on the distributions of label probabilities and feature values, analyzes the data distribution differences between clients, and measures the distance between their datasets, which is then used to cluster clients. Each cluster is trained independently and in parallel on its cluster-center model, and the client clustering step is repeated at the beginning of every training round. Experiments demonstrate the effectiveness of the method.
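The abstract describes clustering clients by distances between their datasets rather than between their model parameters. As a minimal illustrative sketch (not the paper's exact formulation), one way to realize this idea is to compare each client's empirical label-probability vector using Jensen–Shannon divergence and then group clients greedily by a distance threshold. The divergence choice, the greedy thresholding scheme, and all function names below are assumptions for illustration only:

```python
import numpy as np

def label_distribution(labels, num_classes):
    """Empirical label-probability vector for one client's dataset."""
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    return counts / counts.sum()

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two probability vectors."""
    p, q = p + eps, q + eps
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log(a / b))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def cluster_clients(label_dists, threshold=0.1):
    """Greedy clustering: a client joins the first existing cluster whose
    representative distribution is within `threshold` JS divergence;
    otherwise it starts a new cluster."""
    clusters = []  # list of lists of client indices
    reps = []      # one representative distribution per cluster
    for i, d in enumerate(label_dists):
        for c, rep in zip(clusters, reps):
            if js_divergence(d, rep) < threshold:
                c.append(i)
                break
        else:
            clusters.append([i])
            reps.append(d)
    return clusters

# Example: clients 0 and 1 are both skewed toward class 0,
# while client 2 is skewed toward class 2.
dists = [label_distribution(np.array(l), 3) for l in
         ([0, 0, 0, 1], [0, 0, 1, 0], [2, 2, 2, 1])]
print(cluster_clients(dists, threshold=0.1))  # → [[0, 1], [2]]
```

A full realization in the spirit of the paper would also incorporate a feature-value distribution term into the distance, train each resulting cluster's center model in parallel, and rerun this clustering at the start of every round, as the abstract describes.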