SnapCFL: A Pre-Clustering-Based Clustered Federated Learning Framework for Data and System Heterogeneities

IF 7.7 | Zone 2, Computer Science | JCR Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS

IEEE Transactions on Mobile Computing Pub Date : 2025-01-15 DOI:10.1109/TMC.2025.3529487

Yujun Cheng;Weiting Zhang;Zhewei Zhang;Jiawen Kang;Qi Xu;Shengjin Wang;Dusit Niyato

Volume 24, Issue 6, pp. 5214-5228 | https://ieeexplore.ieee.org/document/10839634/

Citations: 0

Abstract


SnapCFL: A Pre-Clustering-Based Clustered Federated Learning Framework for Data and System Heterogeneities

Federated Learning (FL) has emerged as a promising framework to address data privacy concerns associated with mobile devices, in contrast to conventional Machine Learning (ML). However, traditional FL encounters significant challenges due to the heterogeneities among different clients. Clustered Federated Learning (CFL) has demonstrated effectiveness in mitigating the data heterogeneity challenge, which significantly limits a broader application of FL. Nevertheless, existing CFL approaches often tightly couple the clustering process with the main FL process, affecting the flexibility and performance of CFL. In this paper, we propose a pre-clustering-based CFL approach, named SnapCFL, which decouples the CFL process into pre-clustering and main FL stages, considering both the impact of heterogeneity on CFL accuracy and the framework's flexibility. The pre-clustering stage models the measurement of data similarity as a two-sample hypothesis testing problem to more accurately group clients and alleviate data heterogeneity. In the main FL stage, a constraint-based client selection method is employed to address the system heterogeneity problem. We conduct extensive experiments using popular datasets with various heterogeneity settings. The results demonstrate that SnapCFL achieves excellent performance in terms of accuracy and efficiency. Compared to five other state-of-the-art approaches, SnapCFL can improve model accuracy by 0.7%$\sim$36.4%, and achieve the same level of accuracy with at least 0.08× the convergence time.
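The abstract describes the pre-clustering stage only at a high level: data similarity between clients is framed as a two-sample hypothesis test, and clients whose data look alike are grouped before the main FL stage starts. The sketch below illustrates that general idea and is not the paper's algorithm. The abstract does not say which statistic is used, what is compared, or how groups are formed, so every specific here is an assumption: each client is summarised by a list of scalar values, the test is a Kolmogorov-Smirnov statistic with a permutation p-value, and grouping is greedy against the first member of each cluster. The function names are hypothetical.

```python
import random


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the
    empirical CDFs of a and b. (Assumed statistic, not taken from the paper.)"""
    a, b = sorted(a), sorted(b)
    na, nb = len(a), len(b)
    i = j = 0
    gap = 0  # tracked in integer units of 1 / (na * nb) to avoid float noise
    while i < na and j < nb:
        x = min(a[i], b[j])
        while i < na and a[i] <= x:
            i += 1
        while j < nb and b[j] <= x:
            j += 1
        gap = max(gap, abs(i * nb - j * na))
    return gap / (na * nb)


def permutation_p_value(a, b, n_perm=200, rng=None):
    """P-value for H0 'a and b come from the same distribution', estimated by
    reshuffling the pooled samples and recomputing the statistic."""
    rng = rng or random.Random(0)
    observed = ks_statistic(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if ks_statistic(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def pre_cluster(client_samples, alpha=0.05, rng=None):
    """Greedy grouping (an assumption): a client joins the first cluster whose
    representative it cannot be distinguished from at level alpha, otherwise
    it starts a new cluster. client_samples maps client id -> list of floats."""
    rng = rng or random.Random(0)
    clusters = []  # each cluster is a list of client ids; members[0] is its representative
    for cid, sample in client_samples.items():
        for members in clusters:
            representative = client_samples[members[0]]
            if permutation_p_value(sample, representative, rng=rng) >= alpha:
                members.append(cid)
                break
        else:
            clusters.append([cid])
    return clusters
```

Because the grouping runs once, before any training round, its output can be handed to whatever FL procedure follows, which is the decoupling the abstract emphasises. In a real deployment raw samples would not leave the device, so the comparison would have to operate on whatever privacy-preserving summary the framework exchanges; the abstract does not specify that detail.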


Source journal

IEEE Transactions on Mobile Computing (Engineering & Technology: Telecommunications)

CiteScore: 12.90

Self-citation rate: 2.50%

Articles published: 403

Review time: 6.6 months

Journal description: IEEE Transactions on Mobile Computing addresses key technical issues related to various aspects of mobile computing. This includes (a) architectures, (b) support services, (c) algorithm/protocol design and analysis, (d) mobile environments, (e) mobile communication systems, (f) applications, and (g) emerging technologies. Topics of interest span a wide range, covering aspects like mobile networks and hosts, mobility management, multimedia, operating system support, power management, online and mobile environments, security, scalability, reliability, and emerging technologies such as wearable computers, body area networks, and wireless sensor networks. The journal serves as a comprehensive platform for advancements in mobile computing research.