SnapCFL: A Pre-Clustering-Based Clustered Federated Learning Framework for Data and System Heterogeneities
Authors: Yujun Cheng; Weiting Zhang; Zhewei Zhang; Jiawen Kang; Qi Xu; Shengjin Wang; Dusit Niyato
Journal: IEEE Transactions on Mobile Computing, vol. 24, no. 6, pp. 5214-5228 (Journal Article)
Published: 2025-01-15
DOI: 10.1109/TMC.2025.3529487
URL: https://ieeexplore.ieee.org/document/10839634/
Impact factor: 7.7; JCR: Q1 (Computer Science, Information Systems)
Abstract:
Federated Learning (FL) has emerged as a promising framework to address data privacy concerns associated with mobile devices, in contrast to conventional Machine Learning (ML). However, traditional FL encounters significant challenges due to the heterogeneities among different clients. Clustered Federated Learning (CFL) has demonstrated effectiveness in mitigating the data heterogeneity challenge, which significantly limits a broader application of FL. Nevertheless, existing CFL approaches often tightly couple the clustering process with the main FL process, affecting the flexibility and performance of CFL. In this paper, we propose a pre-clustering-based CFL approach, named SnapCFL, which decouples the CFL process into pre-clustering and main FL stages, considering both the impact of heterogeneity on CFL accuracy and the framework's flexibility. The pre-clustering stage models the measurement of data similarity as a two-sample hypothesis testing problem to more accurately group clients and alleviate data heterogeneity. In the main FL stage, a constraint-based client selection method is employed to address the system heterogeneity problem. We conduct extensive experiments using popular datasets with various heterogeneity settings. The results demonstrate that SnapCFL achieves excellent performance in terms of accuracy and efficiency. Compared to five other state-of-the-art approaches, SnapCFL can improve model accuracy by 0.7% to 36.4%, and achieve the same level of accuracy with at least 0.08× the convergence time.
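The abstract does not say which two-sample test SnapCFL uses, what statistic each client shares, or how test outcomes become clusters. As a rough illustration of the general idea only, the sketch below assumes each client exposes a one-dimensional sample of some scalar feature, compares clients with a two-sample Kolmogorov-Smirnov test (swapped in here as a stand-in, not the paper's test), and groups them greedily. The names `ks_statistic`, `same_distribution`, and `pre_cluster`, and the example data, are hypothetical.

```python
import math
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def same_distribution(a, b, alpha=0.05):
    """True when the KS test does not reject H0 (same distribution) at level alpha.

    Uses the standard large-sample critical value
    sqrt(-ln(alpha / 2) / 2) * sqrt((n + m) / (n * m)).
    """
    n, m = len(a), len(b)
    critical = math.sqrt(-0.5 * math.log(alpha / 2)) * math.sqrt((n + m) / (n * m))
    return ks_statistic(a, b) <= critical


def pre_cluster(client_samples, alpha=0.05):
    """Greedy grouping (an assumption, not the paper's procedure): a client joins
    the first cluster whose first member it cannot be distinguished from;
    otherwise it starts a new cluster."""
    clusters = []  # each cluster is a list of client ids
    for cid, samples in client_samples.items():
        for members in clusters:
            if same_distribution(samples, client_samples[members[0]], alpha):
                members.append(cid)
                break
        else:
            clusters.append([cid])
    return clusters


# Hypothetical data: clients "a"/"b" draw from one range, "c"/"d" from another.
clients = {
    "a": [i / 50 for i in range(50)],
    "b": [i / 50 + 0.01 for i in range(50)],
    "c": [5 + i / 50 for i in range(50)],
    "d": [5.01 + i / 50 for i in range(50)],
}
print(pre_cluster(clients))  # → [['a', 'b'], ['c', 'd']]
```

Because the grouping runs once before any training round, the resulting clusters can be handed to an ordinary FL loop unchanged, which is the decoupling the abstract emphasizes.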
About the journal:
IEEE Transactions on Mobile Computing addresses key technical issues related to various aspects of mobile computing. This includes (a) architectures, (b) support services, (c) algorithm/protocol design and analysis, (d) mobile environments, (e) mobile communication systems, (f) applications, and (g) emerging technologies. Topics of interest span a wide range, covering aspects like mobile networks and hosts, mobility management, multimedia, operating system support, power management, online and mobile environments, security, scalability, reliability, and emerging technologies such as wearable computers, body area networks, and wireless sensor networks. The journal serves as a comprehensive platform for advancements in mobile computing research.