Adaptive Parameter-Efficient Federated Fine-Tuning on Heterogeneous Devices

IF 9.2 · CAS Region 2 (Computer Science) · JCR Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS
Jun Liu;Yunming Liao;Hongli Xu;Yang Xu;Jianchun Liu;Chen Qian
{"title":"Adaptive Parameter-Efficient Federated Fine-Tuning on Heterogeneous Devices","authors":"Jun Liu;Yunming Liao;Hongli Xu;Yang Xu;Jianchun Liu;Chen Qian","doi":"10.1109/TMC.2025.3586644","DOIUrl":null,"url":null,"abstract":"Federated fine-tuning (FedFT) has been proposed to fine-tune the pre-trained language models in a distributed manner. However, there are two critical challenges for efficient FedFT in practical applications, i.e., resource constraints and system heterogeneity. Existing works rely on parameter-efficient fine-tuning methods, e.g., low-rank adaptation (LoRA)<sup>1</sup>, but with major limitations. Herein, based on the inherent characteristics of FedFT, we observe that LoRA layers with higher ranks added close to the output help to save resource consumption while achieving comparable fine-tuning performance. Then we propose a novel LoRA-based FedFT framework, termed LEGEND, which faces the difficulty of determining the number of LoRA layers (called, LoRA depth) and the rank of each LoRA layer (called, rank distribution). We analyze the coupled relationship between LoRA depth and rank distribution, and design an efficient LoRA configuration algorithm for heterogeneous devices, thereby promoting fine-tuning efficiency. Extensive experiments are conducted on a physical platform with 80 commercial devices. The results show that LEGEND can achieve a speedup of 1.5-2.8× and save communication costs by about 42.3% when achieving the target accuracy, compared to the advanced solutions.","PeriodicalId":50389,"journal":{"name":"IEEE Transactions on Mobile Computing","volume":"24 11","pages":"12533-12549"},"PeriodicalIF":9.2000,"publicationDate":"2025-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Mobile Computing","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/11072379/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Citations: 0

Abstract

Federated fine-tuning (FedFT) has been proposed to fine-tune pre-trained language models in a distributed manner. However, two critical challenges hinder efficient FedFT in practical applications: resource constraints and system heterogeneity. Existing works rely on parameter-efficient fine-tuning methods, e.g., low-rank adaptation (LoRA), but with major limitations. Herein, based on the inherent characteristics of FedFT, we observe that adding LoRA layers with higher ranks close to the output saves resource consumption while achieving comparable fine-tuning performance. We then propose a novel LoRA-based FedFT framework, termed LEGEND, whose key difficulty lies in determining the number of LoRA layers (the LoRA depth) and the rank of each LoRA layer (the rank distribution). We analyze the coupled relationship between LoRA depth and rank distribution and design an efficient LoRA configuration algorithm for heterogeneous devices, thereby improving fine-tuning efficiency. Extensive experiments are conducted on a physical platform with 80 commercial devices. The results show that, compared to advanced solutions, LEGEND achieves a speedup of 1.5-2.8× and saves communication costs by about 42.3% when reaching the target accuracy.
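
The abstract's two configuration knobs, LoRA depth (how many layers, counted from the output, receive an adapter) and rank distribution (the rank assigned to each of those layers), can be made concrete with a short sketch. The snippet below is a minimal, hypothetical illustration and not the authors' LEGEND implementation; the names LoRALinear and configure_lora are assumptions introduced here. It attaches LoRA adapters only to the last `depth` blocks of a toy model and assigns each block a rank from a given list, placing higher ranks closer to the output as the paper's observation suggests.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    # Frozen pre-trained linear layer plus a trainable low-rank update (B @ A).
    def __init__(self, base: nn.Linear, rank: int, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pre-trained weights stay frozen
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling

def configure_lora(layers: nn.ModuleList, depth: int, ranks: list[int]) -> None:
    # Wrap only the last `depth` layers (those closest to the output),
    # giving the i-th wrapped layer rank ranks[i]; earlier layers stay frozen.
    assert len(ranks) == depth
    for offset, rank in enumerate(ranks):
        idx = len(layers) - depth + offset
        layers[idx] = LoRALinear(layers[idx], rank)

# Toy model: a stack of 6 linear "blocks"; adapt only the last 3,
# with higher ranks closer to the output.
model = nn.ModuleList([nn.Linear(64, 64) for _ in range(6)])
configure_lora(model, depth=3, ranks=[4, 8, 16])

x = torch.randn(2, 64)
for block in model:
    x = block(x)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable LoRA parameters: {trainable}, output shape: {tuple(x.shape)}")

In the paper's setting, depth and the per-layer ranks would not be fixed by hand as in this sketch; the framework chooses them jointly for heterogeneous devices according to their resources, which is exactly the coupled configuration problem the abstract describes.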
Source Journal

IEEE Transactions on Mobile Computing (Engineering & Technology - Telecommunications)
CiteScore: 12.90
Self-citation rate: 2.50%
Articles per year: 403
Review time: 6.6 months

Journal description: IEEE Transactions on Mobile Computing addresses key technical issues related to various aspects of mobile computing, including (a) architectures, (b) support services, (c) algorithm/protocol design and analysis, (d) mobile environments, (e) mobile communication systems, (f) applications, and (g) emerging technologies. Topics of interest span a wide range, covering mobile networks and hosts, mobility management, multimedia, operating system support, power management, online and mobile environments, security, scalability, reliability, and emerging technologies such as wearable computers, body area networks, and wireless sensor networks. The journal serves as a comprehensive platform for advancements in mobile computing research.