SCDFL: A Spectral Clustering-based framework for accelerating convergence in Decentralized Federated Learning

IF 4.6 | Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Computer Networks Pub Date : 2025-08-14 DOI:10.1016/j.comnet.2025.111615

Faisal Alshami, Lin Yao, Xin Wang, Guowei Wu

Volume 271, Article 111615. Full text: https://www.sciencedirect.com/science/article/pii/S1389128625005821

Citations: 0

Abstract



Decentralized Federated Learning (DFL) is a popular distributed machine learning framework in which multiple clients collaborate to develop a global model without depending on a central server. This architecture suffers from client convergence issues, which lead to network congestion and slower convergence during the DFL process. These challenges stem from the variety of communication topologies and from the non-independent and identically distributed (non-IID) nature of data on terminal devices in real-world scenarios, both of which affect model convergence speed and overall terminal performance. We therefore propose SCDFL, a federated learning framework that leverages spectral clustering to handle client data heterogeneity efficiently and scalably. SCDFL introduces a novel spectral clustering strategy that groups clients based on their characteristics. First, incremental PCA reduces the dimensionality of the client data, which consists of high-dimensional model updates or feature vectors, making the clustering process more efficient. Then, a similarity matrix is computed from the reduced data to measure client similarity. Using this matrix, we apply spectral clustering to group clients with similar data characteristics. Finally, we apply intra-cluster and inter-cluster aggregation to update the global model. Extensive experiments across different topologies demonstrate that SCDFL achieves higher accuracy, faster convergence, reduced communication overhead, and improved generalization, particularly on complex datasets such as MNIST, CIFAR10, and CIFAR100, while efficiently handling data heterogeneity and optimizing resource utilization across various network topologies.
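The four steps named in the abstract (dimensionality reduction, similarity matrix, spectral clustering, two-level aggregation) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names are hypothetical, batch PCA via SVD stands in for incremental PCA, the Gaussian (RBF) kernel with a median-based bandwidth is an assumed similarity measure, and size-weighted averaging is an assumed aggregation rule. The abstract does not specify any of these details.

```python
import numpy as np

def reduce_dim(updates, n_components):
    """Project flattened client updates onto their top principal components.
    ASSUMPTION: plain batch PCA via SVD, standing in for incremental PCA."""
    centered = updates - updates.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def rbf_similarity(z, gamma=None):
    """Pairwise Gaussian similarity between clients.
    ASSUMPTION: the kernel and the median bandwidth heuristic are illustrative."""
    sq = ((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=-1)
    if gamma is None:
        gamma = 1.0 / np.median(sq[sq > 0])
    return np.exp(-gamma * sq)

def spectral_clusters(sim, k, iters=50):
    """Normalized spectral clustering: embed clients with the k smallest
    eigenvectors of the normalized Laplacian, then run a small k-means."""
    d_inv_sqrt = 1.0 / np.sqrt(sim.sum(axis=1))
    lap = np.eye(len(sim)) - d_inv_sqrt[:, None] * sim * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)  # eigenvalues are returned in ascending order
    emb = vecs[:, :k]
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    # Deterministic farthest-point initialisation of the k centers.
    centers = [emb[0]]
    for _ in range(1, k):
        dist = np.min([((emb - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(emb[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(iters):
        labels = np.argmin(((emb[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        new = np.array([emb[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

def two_level_aggregate(updates, labels):
    """Intra-cluster: average the updates inside each cluster.
    Inter-cluster: average the cluster models, weighted by cluster size.
    ASSUMPTION: size weighting is illustrative, not taken from the paper."""
    ids = np.unique(labels)
    cluster_models = np.stack([updates[labels == c].mean(axis=0) for c in ids])
    sizes = np.array([(labels == c).sum() for c in ids])
    global_model = np.average(cluster_models, axis=0, weights=sizes)
    return cluster_models, global_model

# Synthetic demo: 8 clients whose 20-dimensional updates fall into two groups.
rng = np.random.default_rng(0)
updates = np.vstack([rng.normal(1.0, 0.1, size=(4, 20)),
                     rng.normal(-1.0, 0.1, size=(4, 20))])
z = reduce_dim(updates, n_components=2)
sim = rbf_similarity(z)
labels = spectral_clusters(sim, k=2)
cluster_models, global_model = two_level_aggregate(updates, labels)
```

With size-weighted averaging, the inter-cluster step reduces to the plain mean over all clients. A real decentralized deployment would also need to exchange models over the communication topology, which this sketch leaves out.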


Source journal

Computer Networks (Engineering & Technology: Telecommunications)

CiteScore: 10.80

Self-citation rate: 3.60%

Articles published: 434

Review time: 8.6 months

About the journal: Computer Networks is an international, archival journal providing a publication vehicle for complete coverage of all topics of interest to those involved in the computer communications networking area. The audience includes researchers, managers and operators of networks as well as designers and implementors. The Editorial Board will consider any material for publication that is of interest to those groups.