FedMQ+: Towards efficient heterogeneous federated learning with multi-grained quantization

IF 4.1 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Journal of Systems Architecture | Pub Date: 2025-05-30 | DOI: 10.1016/j.sysarc.2025.103460

Mei Cao, Huiyu Wang, Yuan Yuan, Jianbo Lu, Xiaojun Cai, Dongxiao Yu, Mengying Zhao

Volume 167, Article 103460 | Journal Article | Full text: https://www.sciencedirect.com/science/article/pii/S1383762125001328

Citations: 0

Abstract


FedMQ+: Towards efficient heterogeneous federated learning with multi-grained quantization

Federated Learning (FL) is a distributed machine learning paradigm that enables collaborative model training while preserving client data privacy. Although this approach significantly enhances data privacy, it introduces substantial communication overhead. Quantization techniques mitigate this challenge by compressing model parameters into fewer bits. However, traditional quantization methods are primarily implemented at the client level, often overlooking the heterogeneous importance of distinct model parameters. In our previous work, FedMQ, we explored quantization mechanisms at both inter-client and intra-client levels to improve communication efficiency. Nevertheless, this approach compromises model accuracy due to the aggregation of models with limited expressive capability. Additionally, it lacks comprehensive theoretical analysis and mathematical verification. In this paper, we propose FedMQ+, an improved framework for heterogeneous federated learning with enhanced dequantization, to optimize global model performance. First, we design a precise dequantization strategy based on normal functions to accurately reconstruct full-precision weights from the given low-precision weights. Next, we conduct a rigorous theoretical analysis of FedMQ+, establish an upper bound for its convergence, and mathematically demonstrate its O(1/T) convergence rate. Finally, we perform extensive experiments across diverse datasets and models. Experimental results demonstrate that FedMQ+ improves convergence speed by 3.1% to 85.8%, while maintaining comparable model accuracy and achieving superior communication efficiency compared with state-of-the-art baselines.
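The abstract does not spell out the quantizer or the normal-function-based dequantizer, so the sketch below is only a generic illustration of the underlying idea: uniform quantization of a weight vector to a chosen bit width, followed by a naive linear reconstruction. The function names (`quantize`, `dequantize`, `max_error`) and the sample weights are hypothetical and are not taken from FedMQ or FedMQ+.

```python
from typing import List, Tuple


def quantize(weights: List[float], bits: int) -> Tuple[List[int], float, float]:
    """Uniformly map floats onto 2**bits integer levels.

    Generic min-max scheme for illustration; NOT the FedMQ+ quantizer.
    """
    levels = (1 << bits) - 1
    lo, hi = min(weights), max(weights)
    # Guard against a constant vector, where hi == lo would give a zero scale.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize(codes: List[int], scale: float, lo: float) -> List[float]:
    """Naive linear reconstruction (assumption: FedMQ+ replaces this step
    with its own, more precise strategy, which is not reproduced here)."""
    return [lo + c * scale for c in codes]


def max_error(weights: List[float], bits: int) -> float:
    """Largest absolute reconstruction error after a quantize/dequantize round trip."""
    codes, scale, lo = quantize(weights, bits)
    restored = dequantize(codes, scale, lo)
    return max(abs(w - r) for w, r in zip(weights, restored))


if __name__ == "__main__":
    sample = [-0.8, -0.1, 0.0, 0.35, 0.9]  # made-up weights
    for b in (2, 4, 8):
        print(b, "bits -> max error", max_error(sample, b))
```

Fewer bits mean fewer bytes sent per round but a coarser reconstruction; with this scheme the error per weight is bounded by half a quantization step. That trade-off is what motivates assigning different bit widths to different clients and parameters, and what a better dequantizer aims to soften.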


Source journal

Journal of Systems Architecture (Engineering & Technology: Computer Science, Hardware)

CiteScore: 8.70

Self-citation rate: 15.60%

Articles published: 226

Review time: 46 days

About the journal: The Journal of Systems Architecture: Embedded Software Design (JSA) is a journal covering all design and architectural aspects related to embedded systems and software. It ranges from the microarchitecture level via the system software level up to the application-specific architecture level. Aspects such as real-time systems, operating systems, FPGA programming, programming languages, communications (limited to analysis and the software stack), mobile systems, parallel and distributed architectures, as well as additional subjects in the computer and system architecture area, fall within the scope of this journal. Technology will not be a main focus, but its use and relevance to particular designs will be. Case studies are welcome but must contribute more than just a design for a particular piece of software. Design automation of such systems, including methodologies, techniques and tools for their design, as well as novel designs of software components, falls within the scope of this journal. Novel applications that use embedded systems are also central to this journal. While hardware is not a part of this journal, hardware/software co-design methods that consider the interplay between software and hardware components with an emphasis on software are also relevant here.