{"title":"LoRA2: Multi-scale low-rank approximations for fine-tuning large language models","authors":"Jia-Chen Zhang , Yu-Jie Xiong , Chun-Ming Xia , Dong-Hai Zhu , Hong-Jian Zhan","doi":"10.1016/j.neucom.2025.130859","DOIUrl":null,"url":null,"abstract":"<div><div>Fine-tuning large language models (LLMs) with high parameter efficiency for downstream tasks has become a new paradigm. Low-Rank Adaptation (LoRA) significantly reduces the number of trainable parameters for fine-tuning. Although it has demonstrated commendable performance, updating parameters within a single scale may not be the optimal choice for complex downstream tasks. In this paper, we extend the LoRA to multiple scales, dubbed as LoRA<span><math><msup><mrow></mrow><mrow><mn>2</mn></mrow></msup></math></span>. We first combine orthogonal projection theory to train two low-dimensional LoRAs in two mutually orthogonal planes. By multiplying two LoRAs, a high-dimensional LoRA is obtained, forming a multi-scale LoRA. Then, we improve the importance score algorithm, significantly reducing the computation required for parameter sensitivity scoring. By pruning singular values with lower importance scores, thereby enhancing adaptability to various downstream tasks. Extensive experiments are conducted on two widely used pre-trained models to validate the effectiveness of LoRA<span><math><msup><mrow></mrow><mrow><mn>2</mn></mrow></msup></math></span>. Results show that it significantly reduces the number of trainable parameters to just 0.72% compared to full fine-tuning on the DeBERTa-V3-base model, while still delivering highly impressive performance. Our code is available here: <span><span>https://github.com/Godz-z/LoRA-2</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"650 ","pages":"Article 130859"},"PeriodicalIF":6.5000,"publicationDate":"2025-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neurocomputing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0925231225015310","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
Fine-tuning large language models (LLMs) with high parameter efficiency for downstream tasks has become a new paradigm. Low-Rank Adaptation (LoRA) significantly reduces the number of trainable parameters for fine-tuning. Although it has demonstrated commendable performance, updating parameters within a single scale may not be the optimal choice for complex downstream tasks. In this paper, we extend LoRA to multiple scales, dubbed LoRA². We first draw on orthogonal projection theory to train two low-dimensional LoRAs in two mutually orthogonal planes. Multiplying the two LoRAs yields a high-dimensional LoRA, forming a multi-scale LoRA. We then improve the importance score algorithm, significantly reducing the computation required for parameter sensitivity scoring, and prune singular values with lower importance scores, thereby enhancing adaptability to various downstream tasks. Extensive experiments are conducted on two widely used pre-trained models to validate the effectiveness of LoRA². Results show that it reduces the number of trainable parameters to just 0.72% of full fine-tuning on the DeBERTa-V3-base model, while still delivering highly impressive performance. Our code is available here: https://github.com/Godz-z/LoRA-2.
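The core construction the abstract describes, two low-rank branches pushed toward mutually orthogonal planes and multiplied together to form a single weight update, can be illustrated with a short sketch. The following is a minimal, hypothetical PyTorch rendering of that idea, not the authors' implementation (see the linked repository for their code); the module name, the initialization scheme, and the form of the orthogonality penalty are all assumptions.

```python
import torch
import torch.nn as nn


class MultiScaleLoRALinear(nn.Module):
    """Hypothetical sketch: a frozen linear layer plus a multiplied,
    two-branch low-rank update. Names and details are illustrative only;
    see https://github.com/Godz-z/LoRA-2 for the authors' code."""

    def __init__(self, in_features: int, out_features: int, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)  # pre-trained weight stays frozen
        # Branch 1 (B1 @ A1): a square in_features x in_features low-rank map.
        self.A1 = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.B1 = nn.Parameter(torch.randn(in_features, r) * 0.01)
        # Branch 2 (B2 @ A2): projects to out_features. Zero-initializing B2
        # keeps the update inactive at the start of training; B1 must be
        # non-zero, or both branches would receive zero gradients.
        self.A2 = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.B2 = nn.Parameter(torch.zeros(out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Multiplying the two branches composes them into one update:
        # delta(x) = (B2 @ A2) @ (B1 @ A1) @ x
        delta = x @ self.A1.T @ self.B1.T @ self.A2.T @ self.B2.T
        return self.base(x) + self.scaling * delta

    def orthogonality_penalty(self) -> torch.Tensor:
        # Push the two down-projections toward mutually orthogonal planes
        # by penalizing the Frobenius norm of A1 @ A2^T.
        return (self.A1 @ self.A2.T).pow(2).sum()


# Illustrative usage: the penalty is added to the task loss with a small weight.
layer = MultiScaleLoRALinear(768, 768)
x = torch.randn(4, 768)
loss = layer(x).sum() + 1e-3 * layer.orthogonality_penalty()
loss.backward()
```

The abstract's second contribution, importance-score pruning, is not shown here; it would additionally drop singular directions of the learned update whose sensitivity scores fall below a threshold, using the paper's cheaper scoring variant.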
About the journal
Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. The journal covers neurocomputing theory, practice, and applications.