MSLoRA: Meta-learned scaling for adaptive fine-tuning of LoRA

IF 5.5 · Tier 2, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Dan Luo, Kangfeng Zheng, Chunhua Wu, Xiujuan Wang
{"title":"用于LoRA自适应微调的元学习缩放","authors":"Dan Luo ,&nbsp;Kangfeng Zheng ,&nbsp;Chunhua Wu ,&nbsp;Xiujuan Wang","doi":"10.1016/j.neucom.2025.130374","DOIUrl":null,"url":null,"abstract":"<div><div>Low rank adaptation (LoRA) methods have demonstrated strong capabilities in efficiently fine-tuning large models. However, existing LoRA-based approaches typically require manually setting the scaling factor, a process that involves extensive search efforts to find optimal values. To address this challenge, we first develop data-driven heuristic methods that automatically determine layer-wise scaling factors through either activation pattern analysis during forward propagation or gradient behavior monitoring during backward updates. However,their practical performance remains unsatisfactory in applications. Building upon these theoretical foundations, we present MSLoRA, a novel framework that reformulates scaling factor determination as a dynamic optimization problem in parameter-efficient fine-tuning. Our approach innovatively models scaling factors as self-adaptive meta-parameters whose optimal values emerge organically through the interplay between transformer architecture hierarchies and task-specific learning objectives. Extensive experiments conducted across both natural language understanding and generative tasks reveal that MSLoRA consistently outperforms baseline models. This highlights the effectiveness of MSLoRA’s dynamic, layer-specific adjustment mechanism in capturing the complex nature of task-specific activation patterns, making it a more robust and scalable solution for parameter-efficient fine-tuning of large models.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"643 ","pages":"Article 130374"},"PeriodicalIF":5.5000,"publicationDate":"2025-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"MSLoRA: Meta-learned scaling for adaptive fine-tuning of LoRA\",\"authors\":\"Dan Luo ,&nbsp;Kangfeng Zheng ,&nbsp;Chunhua Wu ,&nbsp;Xiujuan Wang\",\"doi\":\"10.1016/j.neucom.2025.130374\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Low rank adaptation (LoRA) methods have demonstrated strong capabilities in efficiently fine-tuning large models. However, existing LoRA-based approaches typically require manually setting the scaling factor, a process that involves extensive search efforts to find optimal values. To address this challenge, we first develop data-driven heuristic methods that automatically determine layer-wise scaling factors through either activation pattern analysis during forward propagation or gradient behavior monitoring during backward updates. However,their practical performance remains unsatisfactory in applications. Building upon these theoretical foundations, we present MSLoRA, a novel framework that reformulates scaling factor determination as a dynamic optimization problem in parameter-efficient fine-tuning. Our approach innovatively models scaling factors as self-adaptive meta-parameters whose optimal values emerge organically through the interplay between transformer architecture hierarchies and task-specific learning objectives. Extensive experiments conducted across both natural language understanding and generative tasks reveal that MSLoRA consistently outperforms baseline models. 
This highlights the effectiveness of MSLoRA’s dynamic, layer-specific adjustment mechanism in capturing the complex nature of task-specific activation patterns, making it a more robust and scalable solution for parameter-efficient fine-tuning of large models.</div></div>\",\"PeriodicalId\":19268,\"journal\":{\"name\":\"Neurocomputing\",\"volume\":\"643 \",\"pages\":\"Article 130374\"},\"PeriodicalIF\":5.5000,\"publicationDate\":\"2025-05-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neurocomputing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S092523122501046X\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neurocomputing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S092523122501046X","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Low-rank adaptation (LoRA) methods have demonstrated strong capabilities in efficiently fine-tuning large models. However, existing LoRA-based approaches typically require manually setting the scaling factor, a process that involves extensive search to find optimal values. To address this challenge, we first develop data-driven heuristic methods that automatically determine layer-wise scaling factors through either activation pattern analysis during forward propagation or gradient behavior monitoring during backward updates. However, their practical performance in applications remains unsatisfactory. Building upon these foundations, we present MSLoRA, a novel framework that reformulates scaling factor determination as a dynamic optimization problem in parameter-efficient fine-tuning. Our approach models scaling factors as self-adaptive meta-parameters whose optimal values emerge through the interplay between transformer architecture hierarchies and task-specific learning objectives. Extensive experiments across both natural language understanding and generation tasks show that MSLoRA consistently outperforms baseline models. This highlights the effectiveness of MSLoRA's dynamic, layer-specific adjustment mechanism in capturing the complex nature of task-specific activation patterns, making it a more robust and scalable solution for parameter-efficient fine-tuning of large models.
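The abstract points to two mechanisms: heuristics that set layer-wise scaling factors from forward-pass activation statistics (or backward-pass gradient behavior), and the MSLoRA formulation in which the scaling factor is itself a learnable meta-parameter optimized alongside the low-rank matrices. The sketch below illustrates both ideas in PyTorch; the class name, rank, initialization, and the activation-norm heuristic are assumptions for illustration only, not the authors' implementation.

```python
# Minimal sketch (assumed names and defaults, not the paper's code) of
# (1) LoRA with a learnable per-layer scaling factor, and
# (2) a simple activation-norm heuristic for choosing an initial scale.
import torch
import torch.nn as nn


class LoRALinearLearnedScale(nn.Module):
    """Frozen pretrained linear layer plus a low-rank update B @ A whose
    scaling factor is a trainable parameter rather than a fixed alpha / r."""

    def __init__(self, base: nn.Linear, rank: int = 8, init_scale: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():          # keep pretrained weights frozen
            p.requires_grad = False
        in_f, out_f = base.in_features, base.out_features
        self.lora_A = nn.Parameter(torch.randn(rank, in_f) * 0.01)  # down-projection
        self.lora_B = nn.Parameter(torch.zeros(out_f, rank))        # up-projection
        # Scaling factor as a self-adaptive (meta-)parameter, optimized jointly
        # with A and B instead of being hand-tuned per layer.
        self.scale = nn.Parameter(torch.tensor(float(init_scale)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        delta = x @ self.lora_A.T @ self.lora_B.T   # low-rank update (B A) x
        return self.base(x) + self.scale * delta


def activation_norm_scale(layer: LoRALinearLearnedScale, x: torch.Tensor,
                          ratio: float = 0.1) -> float:
    """Hypothetical forward-pass heuristic: pick a scale so the low-rank update
    contributes roughly a fixed fraction of the base activation norm."""
    with torch.no_grad():
        base_norm = layer.base(x).norm()
        delta_norm = (x @ layer.lora_A.T @ layer.lora_B.T).norm()
        if delta_norm < 1e-8:          # e.g. fresh zero-initialized B: fall back
            return 1.0
        return (ratio * base_norm / delta_norm).item()


# Usage: wrap an existing projection; only A, B, and the scale are trainable.
layer = LoRALinearLearnedScale(nn.Linear(768, 768), rank=8)
x = torch.randn(4, 768)
y = layer(x)                                        # shape (4, 768)
print([n for n, p in layer.named_parameters() if p.requires_grad])
# ['lora_A', 'lora_B', 'scale']
```

Making the scale a parameter lets each layer settle on its own effective update magnitude during fine-tuning, which is the behavior the abstract attributes to MSLoRA's layer-specific adjustment mechanism.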
Source journal: Neurocomputing (Engineering and Technology, Computer Science: Artificial Intelligence)
CiteScore: 13.10
Self-citation rate: 10.00%
Articles published per year: 1382
Review time: 70 days
Journal introduction: Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.