Adaptive Parameter-Efficient Federated Fine-Tuning on Heterogeneous Devices

IF 9.2 · CAS Region 2 (Computer Science) · JCR Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS
Jun Liu;Yunming Liao;Hongli Xu;Yang Xu;Jianchun Liu;Chen Qian
{"title":"Adaptive Parameter-Efficient Federated Fine-Tuning on Heterogeneous Devices","authors":"Jun Liu;Yunming Liao;Hongli Xu;Yang Xu;Jianchun Liu;Chen Qian","doi":"10.1109/TMC.2025.3586644","DOIUrl":null,"url":null,"abstract":"Federated fine-tuning (FedFT) has been proposed to fine-tune the pre-trained language models in a distributed manner. However, there are two critical challenges for efficient FedFT in practical applications, i.e., resource constraints and system heterogeneity. Existing works rely on parameter-efficient fine-tuning methods, e.g., low-rank adaptation (LoRA)<sup>1</sup>, but with major limitations. Herein, based on the inherent characteristics of FedFT, we observe that LoRA layers with higher ranks added close to the output help to save resource consumption while achieving comparable fine-tuning performance. Then we propose a novel LoRA-based FedFT framework, termed LEGEND, which faces the difficulty of determining the number of LoRA layers (called, LoRA depth) and the rank of each LoRA layer (called, rank distribution). We analyze the coupled relationship between LoRA depth and rank distribution, and design an efficient LoRA configuration algorithm for heterogeneous devices, thereby promoting fine-tuning efficiency. Extensive experiments are conducted on a physical platform with 80 commercial devices. The results show that LEGEND can achieve a speedup of 1.5-2.8× and save communication costs by about 42.3% when achieving the target accuracy, compared to the advanced solutions.","PeriodicalId":50389,"journal":{"name":"IEEE Transactions on Mobile Computing","volume":"24 11","pages":"12533-12549"},"PeriodicalIF":9.2000,"publicationDate":"2025-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Mobile Computing","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/11072379/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Citations: 0

Abstract

Federated fine-tuning (FedFT) has been proposed to fine-tune pre-trained language models in a distributed manner. However, two critical challenges hinder efficient FedFT in practical applications: resource constraints and system heterogeneity. Existing works rely on parameter-efficient fine-tuning methods, e.g., low-rank adaptation (LoRA), but with major limitations. Herein, based on the inherent characteristics of FedFT, we observe that adding LoRA layers with higher ranks close to the output saves resource consumption while achieving comparable fine-tuning performance. We then propose a novel LoRA-based FedFT framework, termed LEGEND, whose key difficulty lies in determining the number of LoRA layers (the LoRA depth) and the rank of each LoRA layer (the rank distribution). We analyze the coupled relationship between LoRA depth and rank distribution and design an efficient LoRA configuration algorithm for heterogeneous devices, thereby improving fine-tuning efficiency. Extensive experiments are conducted on a physical platform with 80 commercial devices. The results show that, compared to advanced solutions, LEGEND achieves a speedup of 1.5-2.8× and saves communication costs by about 42.3% when reaching the target accuracy.
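
The abstract's two configuration knobs, LoRA depth (how many layers, counted from the output, receive an adapter) and rank distribution (the rank assigned to each of those layers), can be made concrete with a short sketch. The snippet below is a minimal, hypothetical illustration and not the authors' LEGEND implementation; the names LoRALinear and configure_lora are assumptions introduced here. It attaches LoRA adapters only to the last `depth` blocks of a toy model and assigns each block a rank from a given list, placing higher ranks closer to the output as the paper's observation suggests.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    # Frozen pre-trained linear layer plus a trainable low-rank update (B @ A).
    def __init__(self, base: nn.Linear, rank: int, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pre-trained weights stay frozen
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling

def configure_lora(layers: nn.ModuleList, depth: int, ranks: list[int]) -> None:
    # Wrap only the last `depth` layers (those closest to the output),
    # giving the i-th wrapped layer rank ranks[i]; earlier layers stay frozen.
    assert len(ranks) == depth
    for offset, rank in enumerate(ranks):
        idx = len(layers) - depth + offset
        layers[idx] = LoRALinear(layers[idx], rank)

# Toy model: a stack of 6 linear "blocks"; adapt only the last 3,
# with higher ranks closer to the output.
model = nn.ModuleList([nn.Linear(64, 64) for _ in range(6)])
configure_lora(model, depth=3, ranks=[4, 8, 16])

x = torch.randn(2, 64)
for block in model:
    x = block(x)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable LoRA parameters: {trainable}, output shape: {tuple(x.shape)}")

In the paper's setting, depth and the per-layer ranks would not be fixed by hand as in this sketch; the framework chooses them jointly for heterogeneous devices according to their resources, which is exactly the coupled configuration problem the abstract describes.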
Source Journal

IEEE Transactions on Mobile Computing (Engineering & Technology - Telecommunications)
CiteScore: 12.90
Self-citation rate: 2.50%
Articles per year: 403
Review time: 6.6 months

Journal description: IEEE Transactions on Mobile Computing addresses key technical issues related to various aspects of mobile computing, including (a) architectures, (b) support services, (c) algorithm/protocol design and analysis, (d) mobile environments, (e) mobile communication systems, (f) applications, and (g) emerging technologies. Topics of interest span a wide range, covering mobile networks and hosts, mobility management, multimedia, operating system support, power management, online and mobile environments, security, scalability, reliability, and emerging technologies such as wearable computers, body area networks, and wireless sensor networks. The journal serves as a comprehensive platform for advancements in mobile computing research.