{"title":"Fed-HeLLo:异构LoRA分配的高效联邦基础模型微调。","authors":"Zikai Zhang, Ping Liu, Jiahao Xu, Rui Hu","doi":"10.1109/TNNLS.2025.3580495","DOIUrl":null,"url":null,"abstract":"<p><p>Federated learning (FL) has recently been used to collaboratively fine-tune foundation models (FMs) across multiple clients. Notably, federated low-rank adaptation (LoRA)-based fine-tuning methods have recently gained attention, which allows clients to fine-tune FMs with a small portion of trainable parameters locally. However, most existing methods do not account for the heterogeneous resources of clients or lack an effective local training strategy to maximize global fine-tuning performance under limited resources. In this work, we propose federated LoRA-based fine-tuning framework with heterogeneous LoRA allocation (Fed-HeLLo), a novel federated LoRA-based fine-tuning framework that enables clients to collaboratively fine-tune an FM with different local trainable LoRA layers. To ensure its effectiveness, we develop several heterogeneous LoRA allocation (HLA) strategies that adaptively allocate local trainable LoRA layers based on clients' resource capabilities and the layer importance. Specifically, based on the dynamic layer importance, we design a Fisher information matrix score-based HLA (FIM-HLA) that leverages dynamic gradient norm information. To better stabilize the training process, we consider the intrinsic importance of LoRA layers and design a geometrically defined HLA (GD-HLA) strategy. It shapes the collective distribution of trainable LoRA layers into specific geometric patterns, such as triangle, inverted triangle, bottleneck, and uniform. Moreover, we extend GD-HLA into a randomized version, named randomized GD-HLA (RGD-HLA), for enhanced model accuracy with randomness. By codesigning the proposed HLA strategies, we incorporate both the dynamic and intrinsic layer importance into the design of our HLA strategy. To thoroughly evaluate our approach, we simulate various complex federated LoRA-based fine-tuning settings using five datasets and three levels of data distributions ranging from independent identically distributed (i.i.d.) to extreme non-i.i.d. The experimental results demonstrate the effectiveness and efficiency of Fed-HeLLo with the proposed HLA strategies. The code is available at https://github.com/ TNI-playground/Fed_HeLLo.</p>","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"PP ","pages":"17556-17569"},"PeriodicalIF":8.9000,"publicationDate":"2025-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Fed-HeLLo: Efficient Federated Foundation Model Fine-Tuning With Heterogeneous LoRA Allocation.\",\"authors\":\"Zikai Zhang, Ping Liu, Jiahao Xu, Rui Hu\",\"doi\":\"10.1109/TNNLS.2025.3580495\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Federated learning (FL) has recently been used to collaboratively fine-tune foundation models (FMs) across multiple clients. Notably, federated low-rank adaptation (LoRA)-based fine-tuning methods have recently gained attention, which allows clients to fine-tune FMs with a small portion of trainable parameters locally. However, most existing methods do not account for the heterogeneous resources of clients or lack an effective local training strategy to maximize global fine-tuning performance under limited resources. 
In this work, we propose federated LoRA-based fine-tuning framework with heterogeneous LoRA allocation (Fed-HeLLo), a novel federated LoRA-based fine-tuning framework that enables clients to collaboratively fine-tune an FM with different local trainable LoRA layers. To ensure its effectiveness, we develop several heterogeneous LoRA allocation (HLA) strategies that adaptively allocate local trainable LoRA layers based on clients' resource capabilities and the layer importance. Specifically, based on the dynamic layer importance, we design a Fisher information matrix score-based HLA (FIM-HLA) that leverages dynamic gradient norm information. To better stabilize the training process, we consider the intrinsic importance of LoRA layers and design a geometrically defined HLA (GD-HLA) strategy. It shapes the collective distribution of trainable LoRA layers into specific geometric patterns, such as triangle, inverted triangle, bottleneck, and uniform. Moreover, we extend GD-HLA into a randomized version, named randomized GD-HLA (RGD-HLA), for enhanced model accuracy with randomness. By codesigning the proposed HLA strategies, we incorporate both the dynamic and intrinsic layer importance into the design of our HLA strategy. To thoroughly evaluate our approach, we simulate various complex federated LoRA-based fine-tuning settings using five datasets and three levels of data distributions ranging from independent identically distributed (i.i.d.) to extreme non-i.i.d. The experimental results demonstrate the effectiveness and efficiency of Fed-HeLLo with the proposed HLA strategies. The code is available at https://github.com/ TNI-playground/Fed_HeLLo.</p>\",\"PeriodicalId\":13303,\"journal\":{\"name\":\"IEEE transactions on neural networks and learning systems\",\"volume\":\"PP \",\"pages\":\"17556-17569\"},\"PeriodicalIF\":8.9000,\"publicationDate\":\"2025-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE transactions on neural networks and learning systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1109/TNNLS.2025.3580495\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on neural networks and learning systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1109/TNNLS.2025.3580495","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Fed-HeLLo: Efficient Federated Foundation Model Fine-Tuning With Heterogeneous LoRA Allocation.
Federated learning (FL) has recently been used to collaboratively fine-tune foundation models (FMs) across multiple clients. Notably, federated low-rank adaptation (LoRA)-based fine-tuning methods have gained attention; these methods allow clients to fine-tune FMs locally by training only a small fraction of the parameters. However, most existing methods either do not account for clients' heterogeneous resources or lack an effective local training strategy to maximize global fine-tuning performance under limited resources. In this work, we propose Fed-HeLLo, a novel federated LoRA-based fine-tuning framework with heterogeneous LoRA allocation that enables clients to collaboratively fine-tune an FM while training different sets of local LoRA layers. To ensure its effectiveness, we develop several heterogeneous LoRA allocation (HLA) strategies that adaptively allocate local trainable LoRA layers based on clients' resource capabilities and layer importance. Specifically, based on dynamic layer importance, we design a Fisher information matrix score-based HLA (FIM-HLA) strategy that leverages dynamic gradient-norm information. To better stabilize training, we consider the intrinsic importance of LoRA layers and design a geometrically defined HLA (GD-HLA) strategy, which shapes the collective distribution of trainable LoRA layers into specific geometric patterns, such as triangle, inverted triangle, bottleneck, and uniform. Moreover, we extend GD-HLA into a randomized version, randomized GD-HLA (RGD-HLA), which introduces randomness to further improve model accuracy. By codesigning these strategies, we incorporate both dynamic and intrinsic layer importance into our HLA design. To thoroughly evaluate our approach, we simulate various complex federated LoRA-based fine-tuning settings using five datasets and three levels of data distribution, ranging from independent and identically distributed (i.i.d.) to extreme non-i.i.d. The experimental results demonstrate the effectiveness and efficiency of Fed-HeLLo with the proposed HLA strategies. The code is available at https://github.com/TNI-playground/Fed_HeLLo.
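To make the FIM-HLA idea concrete, the sketch below (Python/PyTorch) scores each LoRA layer with a diagonal empirical-Fisher proxy computed from gradient norms on one minibatch; layers with higher scores would be prioritized for local training. The function name `fim_layer_scores`, the `lora_layer_prefixes` naming scheme, and the mean-squared-gradient scoring are illustrative assumptions, not the paper's released implementation.

```python
import torch

def fim_layer_scores(model, loss_fn, batch, lora_layer_prefixes):
    """Score each LoRA layer by the mean squared gradient of its parameters.

    This is a diagonal empirical-Fisher proxy computed on a single minibatch;
    the paper's exact FIM score may differ. `lora_layer_prefixes[i]` is assumed
    to be the parameter-name prefix of layer i's LoRA weights.
    """
    model.zero_grad()
    inputs, targets = batch
    loss = loss_fn(model(inputs), targets)
    loss.backward()

    scores = []
    for prefix in lora_layer_prefixes:
        sq_sum, n_params = 0.0, 0
        for name, param in model.named_parameters():
            if name.startswith(prefix) and param.grad is not None:
                sq_sum += param.grad.detach().pow(2).sum().item()
                n_params += param.numel()
        scores.append(sq_sum / max(n_params, 1))
    model.zero_grad()
    return scores  # higher score -> layer more important to train this round
```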
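Similarly, a minimal sketch of how GD-HLA's geometric patterns might translate into per-client layer masks is shown below. The pattern interpretations (e.g., "triangle" favoring deeper layers, "bottleneck" favoring both ends) are our reading of the abstract rather than the paper's exact definitions, and all names are hypothetical.

```python
import numpy as np

def gd_hla_mask(num_layers: int, budget: int, pattern: str) -> np.ndarray:
    """Boolean mask over LoRA layers: mask[i] == True means layer i trains locally.

    Assumed interpretations of the four patterns (not from the paper):
      "triangle"          -- favor deeper layers,
      "inverted_triangle" -- favor shallower layers,
      "bottleneck"        -- favor both ends, skip the middle,
      "uniform"           -- spread the budget evenly across depth.
    """
    assert 0 < budget <= num_layers
    idx = np.arange(num_layers)
    if pattern == "uniform":
        # Evenly spaced indices; rounding may merge neighbors for large budgets.
        chosen = np.unique(np.linspace(0, num_layers - 1, budget).round().astype(int))
    else:
        if pattern == "triangle":
            weights = idx + 1                # grows with depth
        elif pattern == "inverted_triangle":
            weights = num_layers - idx       # shrinks with depth
        elif pattern == "bottleneck":
            mid = (num_layers - 1) / 2
            weights = np.abs(idx - mid)      # largest at the two ends
        else:
            raise ValueError(f"unknown pattern: {pattern}")
        chosen = np.argsort(-weights, kind="stable")[:budget]
    mask = np.zeros(num_layers, dtype=bool)
    mask[chosen] = True
    return mask

# Example: a client that can afford 4 of 12 LoRA layers under "bottleneck"
# gets layers 0, 1, 10, and 11 marked trainable.
print(gd_hla_mask(12, 4, "bottleneck"))
```

Under this reading, RGD-HLA would perturb the chosen indices with a random component, trading some of GD-HLA's determinism for better coverage of the layer space across clients.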
Journal Introduction:
The focus of IEEE Transactions on Neural Networks and Learning Systems is to present scholarly articles discussing the theory, design, and applications of neural networks as well as other learning systems. The journal primarily highlights technical and scientific research in this domain.