Fed-HeLLo: Efficient Federated Foundation Model Fine-Tuning With Heterogeneous LoRA Allocation.

Impact Factor: 8.9 · CAS Tier 1 (Computer Science) · JCR Q1, Computer Science, Artificial Intelligence
Zikai Zhang, Ping Liu, Jiahao Xu, Rui Hu
{"title":"Fed-HeLLo: Efficient Federated Foundation Model Fine-Tuning With Heterogeneous LoRA Allocation.","authors":"Zikai Zhang, Ping Liu, Jiahao Xu, Rui Hu","doi":"10.1109/TNNLS.2025.3580495","DOIUrl":null,"url":null,"abstract":"<p><p>Federated learning (FL) has recently been used to collaboratively fine-tune foundation models (FMs) across multiple clients. Notably, federated low-rank adaptation (LoRA)-based fine-tuning methods have recently gained attention, which allows clients to fine-tune FMs with a small portion of trainable parameters locally. However, most existing methods do not account for the heterogeneous resources of clients or lack an effective local training strategy to maximize global fine-tuning performance under limited resources. In this work, we propose federated LoRA-based fine-tuning framework with heterogeneous LoRA allocation (Fed-HeLLo), a novel federated LoRA-based fine-tuning framework that enables clients to collaboratively fine-tune an FM with different local trainable LoRA layers. To ensure its effectiveness, we develop several heterogeneous LoRA allocation (HLA) strategies that adaptively allocate local trainable LoRA layers based on clients' resource capabilities and the layer importance. Specifically, based on the dynamic layer importance, we design a Fisher information matrix score-based HLA (FIM-HLA) that leverages dynamic gradient norm information. To better stabilize the training process, we consider the intrinsic importance of LoRA layers and design a geometrically defined HLA (GD-HLA) strategy. It shapes the collective distribution of trainable LoRA layers into specific geometric patterns, such as triangle, inverted triangle, bottleneck, and uniform. Moreover, we extend GD-HLA into a randomized version, named randomized GD-HLA (RGD-HLA), for enhanced model accuracy with randomness. By codesigning the proposed HLA strategies, we incorporate both the dynamic and intrinsic layer importance into the design of our HLA strategy. To thoroughly evaluate our approach, we simulate various complex federated LoRA-based fine-tuning settings using five datasets and three levels of data distributions ranging from independent identically distributed (i.i.d.) to extreme non-i.i.d. The experimental results demonstrate the effectiveness and efficiency of Fed-HeLLo with the proposed HLA strategies. The code is available at https://github.com/ TNI-playground/Fed_HeLLo.</p>","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"PP ","pages":"17556-17569"},"PeriodicalIF":8.9000,"publicationDate":"2025-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on neural networks and learning systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1109/TNNLS.2025.3580495","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Federated learning (FL) has recently been used to collaboratively fine-tune foundation models (FMs) across multiple clients. Notably, federated low-rank adaptation (LoRA)-based fine-tuning methods have recently gained attention; they allow clients to fine-tune FMs locally with only a small portion of trainable parameters. However, most existing methods either do not account for the heterogeneous resources of clients or lack an effective local training strategy to maximize global fine-tuning performance under limited resources. In this work, we propose Fed-HeLLo, a novel federated LoRA-based fine-tuning framework with heterogeneous LoRA allocation that enables clients to collaboratively fine-tune an FM with different locally trainable LoRA layers. To ensure its effectiveness, we develop several heterogeneous LoRA allocation (HLA) strategies that adaptively allocate locally trainable LoRA layers based on clients' resource capabilities and layer importance. Specifically, based on dynamic layer importance, we design a Fisher information matrix score-based HLA (FIM-HLA) strategy that leverages dynamic gradient norm information. To better stabilize the training process, we consider the intrinsic importance of LoRA layers and design a geometrically defined HLA (GD-HLA) strategy, which shapes the collective distribution of trainable LoRA layers into specific geometric patterns, such as triangle, inverted triangle, bottleneck, and uniform. Moreover, we extend GD-HLA into a randomized version, named randomized GD-HLA (RGD-HLA), which introduces randomness to further improve model accuracy. By codesigning the proposed HLA strategies, we incorporate both dynamic and intrinsic layer importance into our allocation design. To thoroughly evaluate our approach, we simulate various complex federated LoRA-based fine-tuning settings using five datasets and three levels of data distribution, ranging from independent and identically distributed (i.i.d.) to extreme non-i.i.d. The experimental results demonstrate the effectiveness and efficiency of Fed-HeLLo with the proposed HLA strategies. The code is available at https://github.com/TNI-playground/Fed_HeLLo.
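The abstract describes two allocation ideas at a high level: scoring LoRA layers by gradient-norm (Fisher-information-style) importance, and assigning each client a subset of trainable LoRA layers so that the collective distribution follows a geometric pattern. The sketch below illustrates both ideas in PyTorch under stated assumptions; the function names (fim_hla_scores, gd_hla_mask), the mapping of pattern names to layer depths, and the use of squared gradient norms as the FIM score are illustrative assumptions, not taken from the Fed-HeLLo implementation.

```python
import torch

def fim_hla_scores(model, dataloader, loss_fn, device="cpu"):
    """Score each LoRA parameter by its accumulated squared gradient norm,
    a diagonal-Fisher-style proxy for the dynamic layer importance that
    FIM-HLA is described as using. Assumes `model` is a torch.nn.Module
    whose LoRA parameter names contain the substring "lora"."""
    model.to(device).train()
    scores = {}
    for x, y in dataloader:
        model.zero_grad()
        loss = loss_fn(model(x.to(device)), y.to(device))
        loss.backward()
        for name, p in model.named_parameters():
            if "lora" in name and p.grad is not None:
                scores[name] = scores.get(name, 0.0) + torch.sum(p.grad.detach() ** 2).item()
    return scores


def gd_hla_mask(num_layers, capacity, pattern="triangle"):
    """Return a boolean mask of which LoRA layers a client keeps trainable,
    so that the collective distribution follows a geometric pattern
    (triangle, inverted triangle, bottleneck, or uniform).
    `capacity` is the number of layers this client can afford to train."""
    idx = list(range(num_layers))
    if pattern == "triangle":             # assumption: favor deeper (later) layers
        ranked = idx[::-1]
    elif pattern == "inverted_triangle":  # assumption: favor shallower (earlier) layers
        ranked = idx
    elif pattern == "bottleneck":         # favor both ends, leave the middle frozen
        ranked = sorted(idx, key=lambda i: -abs(i - (num_layers - 1) / 2))
    else:                                 # uniform: spread trainable layers evenly
        step = max(1, num_layers // capacity)
        ranked = idx[::step] + [i for i in idx if i not in idx[::step]]
    chosen = set(ranked[:capacity])
    return [i in chosen for i in idx]
```

For example, with a 12-layer backbone and a capacity of 4 trainable LoRA layers, gd_hla_mask(12, 4, "bottleneck") marks layers 0, 1, 10, and 11 as trainable. A randomized variant in the spirit of RGD-HLA could perturb the ranking before truncation; the paper's exact formulation may differ.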

Source Journal

IEEE Transactions on Neural Networks and Learning Systems
Categories: Computer Science, Artificial Intelligence; Computer Science, Hardware & Architecture
CiteScore: 23.80
Self-citation rate: 9.60%
Articles published: 2102
Review time: 3-8 weeks
About the journal: The focus of IEEE Transactions on Neural Networks and Learning Systems is to present scholarly articles discussing the theory, design, and applications of neural networks as well as other learning systems. The journal primarily highlights technical and scientific research in this domain.