MSLoRA: Meta-learned scaling for adaptive fine-tuning of LoRA

IF 5.5 · Tier 2, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Dan Luo, Kangfeng Zheng, Chunhua Wu, Xiujuan Wang
{"title":"用于LoRA自适应微调的元学习缩放","authors":"Dan Luo ,&nbsp;Kangfeng Zheng ,&nbsp;Chunhua Wu ,&nbsp;Xiujuan Wang","doi":"10.1016/j.neucom.2025.130374","DOIUrl":null,"url":null,"abstract":"<div><div>Low rank adaptation (LoRA) methods have demonstrated strong capabilities in efficiently fine-tuning large models. However, existing LoRA-based approaches typically require manually setting the scaling factor, a process that involves extensive search efforts to find optimal values. To address this challenge, we first develop data-driven heuristic methods that automatically determine layer-wise scaling factors through either activation pattern analysis during forward propagation or gradient behavior monitoring during backward updates. However,their practical performance remains unsatisfactory in applications. Building upon these theoretical foundations, we present MSLoRA, a novel framework that reformulates scaling factor determination as a dynamic optimization problem in parameter-efficient fine-tuning. Our approach innovatively models scaling factors as self-adaptive meta-parameters whose optimal values emerge organically through the interplay between transformer architecture hierarchies and task-specific learning objectives. Extensive experiments conducted across both natural language understanding and generative tasks reveal that MSLoRA consistently outperforms baseline models. This highlights the effectiveness of MSLoRA’s dynamic, layer-specific adjustment mechanism in capturing the complex nature of task-specific activation patterns, making it a more robust and scalable solution for parameter-efficient fine-tuning of large models.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"643 ","pages":"Article 130374"},"PeriodicalIF":5.5000,"publicationDate":"2025-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"MSLoRA: Meta-learned scaling for adaptive fine-tuning of LoRA\",\"authors\":\"Dan Luo ,&nbsp;Kangfeng Zheng ,&nbsp;Chunhua Wu ,&nbsp;Xiujuan Wang\",\"doi\":\"10.1016/j.neucom.2025.130374\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Low rank adaptation (LoRA) methods have demonstrated strong capabilities in efficiently fine-tuning large models. However, existing LoRA-based approaches typically require manually setting the scaling factor, a process that involves extensive search efforts to find optimal values. To address this challenge, we first develop data-driven heuristic methods that automatically determine layer-wise scaling factors through either activation pattern analysis during forward propagation or gradient behavior monitoring during backward updates. However,their practical performance remains unsatisfactory in applications. Building upon these theoretical foundations, we present MSLoRA, a novel framework that reformulates scaling factor determination as a dynamic optimization problem in parameter-efficient fine-tuning. Our approach innovatively models scaling factors as self-adaptive meta-parameters whose optimal values emerge organically through the interplay between transformer architecture hierarchies and task-specific learning objectives. Extensive experiments conducted across both natural language understanding and generative tasks reveal that MSLoRA consistently outperforms baseline models. 
This highlights the effectiveness of MSLoRA’s dynamic, layer-specific adjustment mechanism in capturing the complex nature of task-specific activation patterns, making it a more robust and scalable solution for parameter-efficient fine-tuning of large models.</div></div>\",\"PeriodicalId\":19268,\"journal\":{\"name\":\"Neurocomputing\",\"volume\":\"643 \",\"pages\":\"Article 130374\"},\"PeriodicalIF\":5.5000,\"publicationDate\":\"2025-05-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neurocomputing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S092523122501046X\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neurocomputing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S092523122501046X","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Low-rank adaptation (LoRA) methods have demonstrated strong capabilities in efficiently fine-tuning large models. However, existing LoRA-based approaches typically require manually setting the scaling factor, a process that involves extensive search to find optimal values. To address this challenge, we first develop data-driven heuristic methods that automatically determine layer-wise scaling factors through either activation pattern analysis during forward propagation or gradient behavior monitoring during backward updates. However, their practical performance in applications remains unsatisfactory. Building upon these foundations, we present MSLoRA, a novel framework that reformulates scaling factor determination as a dynamic optimization problem in parameter-efficient fine-tuning. Our approach models scaling factors as self-adaptive meta-parameters whose optimal values emerge through the interplay between transformer architecture hierarchies and task-specific learning objectives. Extensive experiments across both natural language understanding and generation tasks show that MSLoRA consistently outperforms baseline models. This highlights the effectiveness of MSLoRA's dynamic, layer-specific adjustment mechanism in capturing the complex nature of task-specific activation patterns, making it a more robust and scalable solution for parameter-efficient fine-tuning of large models.
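The abstract points to two mechanisms: heuristics that set layer-wise scaling factors from forward-pass activation statistics (or backward-pass gradient behavior), and the MSLoRA formulation in which the scaling factor is itself a learnable meta-parameter optimized alongside the low-rank matrices. The sketch below illustrates both ideas in PyTorch; the class name, rank, initialization, and the activation-norm heuristic are assumptions for illustration only, not the authors' implementation.

```python
# Minimal sketch (assumed names and defaults, not the paper's code) of
# (1) LoRA with a learnable per-layer scaling factor, and
# (2) a simple activation-norm heuristic for choosing an initial scale.
import torch
import torch.nn as nn


class LoRALinearLearnedScale(nn.Module):
    """Frozen pretrained linear layer plus a low-rank update B @ A whose
    scaling factor is a trainable parameter rather than a fixed alpha / r."""

    def __init__(self, base: nn.Linear, rank: int = 8, init_scale: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():          # keep pretrained weights frozen
            p.requires_grad = False
        in_f, out_f = base.in_features, base.out_features
        self.lora_A = nn.Parameter(torch.randn(rank, in_f) * 0.01)  # down-projection
        self.lora_B = nn.Parameter(torch.zeros(out_f, rank))        # up-projection
        # Scaling factor as a self-adaptive (meta-)parameter, optimized jointly
        # with A and B instead of being hand-tuned per layer.
        self.scale = nn.Parameter(torch.tensor(float(init_scale)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        delta = x @ self.lora_A.T @ self.lora_B.T   # low-rank update (B A) x
        return self.base(x) + self.scale * delta


def activation_norm_scale(layer: LoRALinearLearnedScale, x: torch.Tensor,
                          ratio: float = 0.1) -> float:
    """Hypothetical forward-pass heuristic: pick a scale so the low-rank update
    contributes roughly a fixed fraction of the base activation norm."""
    with torch.no_grad():
        base_norm = layer.base(x).norm()
        delta_norm = (x @ layer.lora_A.T @ layer.lora_B.T).norm()
        if delta_norm < 1e-8:          # e.g. fresh zero-initialized B: fall back
            return 1.0
        return (ratio * base_norm / delta_norm).item()


# Usage: wrap an existing projection; only A, B, and the scale are trainable.
layer = LoRALinearLearnedScale(nn.Linear(768, 768), rank=8)
x = torch.randn(4, 768)
y = layer(x)                                        # shape (4, 768)
print([n for n, p in layer.named_parameters() if p.requires_grad])
# ['lora_A', 'lora_B', 'scale']
```

Making the scale a parameter lets each layer settle on its own effective update magnitude during fine-tuning, which is the behavior the abstract attributes to MSLoRA's layer-specific adjustment mechanism.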
Source journal: Neurocomputing (Engineering and Technology, Computer Science: Artificial Intelligence)
CiteScore: 13.10
Self-citation rate: 10.00%
Articles published per year: 1382
Review time: 70 days
Journal introduction: Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.