FedMQ+: Towards efficient heterogeneous federated learning with multi-grained quantization

IF 3.7 | Region 2 (Computer Science) | JCR Q1: COMPUTER SCIENCE, HARDWARE & ARCHITECTURE
Mei Cao, Huiyu Wang, Yuan Yuan, Jianbo Lu, Xiaojun Cai, Dongxiao Yu, Mengying Zhao
DOI: 10.1016/j.sysarc.2025.103460
Journal: Journal of Systems Architecture, Volume 167, Article 103460
Publication date: 2025-05-30
Publication type: Journal Article
Open access: No
Source: https://www.sciencedirect.com/science/article/pii/S1383762125001328
Citations: 0

Abstract
Federated Learning (FL) is a distributed machine learning paradigm that enables collaborative model training while preserving client data privacy. Although this approach significantly enhances data privacy, it introduces substantial communication overhead. Quantization techniques mitigate this challenge by compressing model parameters into fewer bits. However, traditional quantization methods are primarily implemented at the client level, often overlooking the heterogeneous importance of distinct model parameters. In our previous work, FedMQ, we explored quantization mechanisms at both the inter-client and intra-client levels to improve communication efficiency. Nevertheless, that approach compromises model accuracy because it aggregates models with limited expressive capability. It also lacks a comprehensive theoretical analysis and mathematical verification. In this paper, we propose FedMQ+, an improved framework for heterogeneous federated learning with enhanced dequantization, to optimize global model performance. First, we design a precise dequantization strategy based on normal functions to accurately reconstruct full-precision weights from the given low-precision weights. Next, we conduct a rigorous theoretical analysis of FedMQ+, establish an upper bound for its convergence, and mathematically demonstrate its O(1/T) convergence rate. Finally, we perform extensive experiments across diverse datasets and models. The results show that FedMQ+ improves convergence speed by 3.1% to 85.8% while maintaining comparable model accuracy and achieving superior communication efficiency compared with state-of-the-art baselines.
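To make the quantize/dequantize round trip mentioned in the abstract concrete, here is a minimal sketch. It assumes plain uniform (min-max) quantization, which is a stand-in chosen for illustration and not the FedMQ+ scheme (the abstract describes a dequantization strategy based on normal functions, whose details are not given here). The helper names `quantize`, `dequantize`, and `max_error`, and the synthetic `layer` of weights, are hypothetical.

```python
import random


def quantize(weights, bits):
    """Uniformly quantize floats to integer codes in [0, 2**bits - 1]."""
    lo, hi = min(weights), max(weights)
    levels = (1 << bits) - 1
    # Guard against a constant layer, where hi == lo.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Map integer codes back to approximate full-precision values."""
    return [lo + c * scale for c in codes]


def max_error(weights, bits):
    """Largest absolute reconstruction error after one round trip."""
    codes, lo, scale = quantize(weights, bits)
    restored = dequantize(codes, lo, scale)
    return max(abs(w - r) for w, r in zip(weights, restored))


random.seed(0)
# Hypothetical layer: 1000 weights drawn from a normal distribution.
layer = [random.gauss(0.0, 0.1) for _ in range(1000)]

# Fewer bits mean less data to transmit but a larger reconstruction error.
for bits in (2, 4, 8):
    print(f"{bits} bits -> max error {max_error(layer, bits):.5f}")
```

Under this uniform scheme the round-trip error is at most half a quantization step, so each extra bit roughly halves the worst-case error. A multi-grained approach in the spirit of the abstract would assign different bit widths to different clients or parameter groups; how FedMQ+ chooses them is not specified in this listing.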
Source journal
Journal of Systems Architecture (Engineering & Technology, Computer Science: Hardware)
CiteScore: 8.70
Self-citation rate: 15.60%
Articles published: 226
Review time: 46 days
Journal description: The Journal of Systems Architecture: Embedded Software Design (JSA) is a journal covering all design and architectural aspects related to embedded systems and software. It ranges from the microarchitecture level via the system software level up to the application-specific architecture level. Aspects such as real-time systems, operating systems, FPGA programming, programming languages, communications (limited to analysis and the software stack), mobile systems, parallel and distributed architectures, as well as additional subjects in the computer and system architecture area, fall within the scope of this journal. Technology will not be a main focus, but its use and relevance to particular designs will be. Case studies are welcome but must contribute more than just a design for a particular piece of software. Design automation of such systems, including methodologies, techniques and tools for their design, as well as novel designs of software components, falls within the scope of this journal. Novel applications that use embedded systems are also central to this journal. While hardware is not a part of this journal, hardware/software co-design methods that consider the interplay between software and hardware components with an emphasis on software are also relevant here.