GF-SVD: Global knowledge-infused singular value decomposition of large language models
Xiangxiang Gao, Weisheng Xie, Yuhan Lin, Chen Hang, Hongyang Han, Xiaolong Xu, Bo Liu
Information Fusion, Volume 127, Article 103774. DOI: 10.1016/j.inffus.2025.103774. Published: 2025-09-29.
Citations: 0
Abstract
Singular Value Decomposition (SVD) provides an efficient solution for compressing and accelerating Large Language Models (LLMs) without retraining or specialized hardware. Despite its advantages, current SVD-based LLM compression methods suffer from three critical limitations that degrade performance: (1) cross-domain knowledge preservation is compromised, (2) layer-isolated decomposition disrupts inter-layer information flow, and (3) aggressive truncation of singular values and their corresponding vectors gradually erodes knowledge. To overcome these limitations, we propose GF-SVD, a novel framework that integrates: (1) Hierarchical Knowledge Infusion, which enhances dataset diversity by integrating hierarchical knowledge to improve cross-domain generalization; (2) Global Information Integration, which captures inter-layer dependencies and broader context via weighted aggregation of multi-layer feature matrices; and (3) Knowledge-Enhanced Truncation and Updating, which truncates and updates weights with the infused dataset to mitigate knowledge erosion. Extensive experiments demonstrate that GF-SVD surpasses existing SVD-based LLM compression methods across diverse tasks, including knowledge-intensive question answering, complex reasoning, physical-system reasoning, and mathematical problem-solving. Notably, GF-SVD also improves inference speed by 2.36x on GPUs and 2.74x on CPUs at a 60% compression ratio.
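The core operation the abstract refers to, truncating singular values to obtain a low-rank replacement for a weight matrix, can be sketched generically. The snippet below is a minimal illustration of plain truncated-SVD weight compression, not the GF-SVD method itself: the function name `truncated_svd_compress`, the heuristic mapping a 60% compression ratio to a retained rank, and the toy 512x512 matrix are assumptions for illustration, and GF-SVD's hierarchical knowledge infusion, global weighted aggregation of multi-layer feature matrices, and knowledge-enhanced weight updating are not reproduced here.

```python
import numpy as np

def truncated_svd_compress(W, ratio=0.6):
    """Generic truncated-SVD weight compression (illustrative only):
    factor W (d_out x d_in) into A (d_out x r) and B (r x d_in) so that
    A @ B approximates W using only the top-r singular directions."""
    d_out, d_in = W.shape
    # Assumed rank-selection rule: keep r so the factored parameters
    # r * (d_out + d_in) use roughly (1 - ratio) of the original d_out * d_in.
    r = max(1, int((1.0 - ratio) * d_out * d_in / (d_out + d_in)))
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :r] * S[:r]   # scale each retained left singular vector by its singular value
    B = Vt[:r, :]          # retained right singular vectors
    return A, B

# Example: compress a toy 512 x 512 projection at a 60% compression ratio.
rng = np.random.default_rng(0)
W = rng.standard_normal((512, 512)).astype(np.float32)
A, B = truncated_svd_compress(W, ratio=0.6)
x = rng.standard_normal(512).astype(np.float32)
y_full, y_low = W @ x, A @ (B @ x)   # low-rank path: two small matmuls
print(A.shape, B.shape, np.linalg.norm(y_full - y_low) / np.linalg.norm(y_full))
```

Replacing `W @ x` with `A @ (B @ x)` is what makes such compression yield inference speedups when r is much smaller than the matrix dimensions, since the two small matrix-vector products cost about r * (d_out + d_in) multiply-adds instead of d_out * d_in.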
About the journal:
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, with a focus on architectures, algorithms, and applications. Papers presenting fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.