An Accuracy-Preserving Neural Network Compression via Tucker Decomposition
Authors: Can Liu; Kun Xie; Jigang Wen; Gaogang Xie; Kenli Li
Journal: IEEE Transactions on Sustainable Computing, vol. 10, no. 2, pp. 262-273
Publication date: 2024-07-29
DOI: 10.1109/TSUSC.2024.3425962
URL: https://ieeexplore.ieee.org/document/10614384/
Deep learning has made remarkable progress across many domains, driven by over-parameterized neural networks of increasing complexity. However, practical applications often require compact and efficient networks because of device constraints. Among recent low-rank decomposition-based neural network compression techniques, Tucker decomposition has emerged as a promising method that effectively compresses the network while preserving the high-order structure and information of the parameters. Despite its promise, designing an efficient Tucker decomposition approach that compresses neural networks while maintaining accuracy is challenging, because ranks must be set across multiple layers and extensive fine-tuning is typically needed. This paper introduces a novel accuracy-aware network compression problem under Tucker decomposition, which considers both network accuracy and compression performance in terms of parameter size. To address this problem, we propose an efficient alternating optimization algorithm that iteratively solves a network training sub-problem and a Tucker decomposition sub-problem to compress the network with performance assurance. Proper Tucker ranks for multiple layers are selected during network training, enabling efficient compression without extensive fine-tuning. We conduct extensive experiments on image classification with five neural networks and four benchmark datasets. The experimental results demonstrate that, without extensive fine-tuning, our proposed method significantly reduces model size with minimal loss in accuracy, outperforming baseline methods.
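To make the idea of compressing a layer in terms of parameter size concrete, the following is a minimal sketch of a Tucker decomposition applied to a single convolution kernel. It uses a truncated higher-order SVD (HOSVD), which is one standard way to compute a Tucker decomposition. It does not reproduce the paper's alternating optimization algorithm or its rank selection. The function names, the kernel shape `(64, 32, 3, 3)`, and the ranks `(16, 8, 3, 3)` are illustrative assumptions, not values taken from the paper.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-n unfolding: bring `mode` to the front and flatten the other axes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` (shape: new_dim x old_dim) along axis `mode`."""
    moved = np.moveaxis(tensor, mode, 0)
    result = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(result, 0, mode)


def truncated_hosvd(tensor, ranks):
    """Return (core, factors) of a Tucker decomposition via truncated HOSVD."""
    factors = []
    for mode, rank in enumerate(ranks):
        # Leading left singular vectors of each unfolding form the factor matrix.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])
    core = tensor
    for mode, factor in enumerate(factors):
        # Project onto each factor's column space to obtain the small core tensor.
        core = mode_product(core, factor.T, mode)
    return core, factors


def reconstruct(core, factors):
    """Rebuild the (approximate) full tensor from the core and factor matrices."""
    out = core
    for mode, factor in enumerate(factors):
        out = mode_product(out, factor, mode)
    return out


def tucker_param_count(shape, ranks):
    """Parameters stored by a Tucker format: the core plus one factor per mode."""
    return int(np.prod(ranks)) + sum(d * r for d, r in zip(shape, ranks))


# Hypothetical conv kernel: (out_channels, in_channels, height, width).
shape = (64, 32, 3, 3)
ranks = (16, 8, 3, 3)  # illustrative ranks, chosen by hand here
weight = np.random.default_rng(0).standard_normal(shape)

core, factors = truncated_hosvd(weight, ranks)
approx = reconstruct(core, factors)

dense_params = int(np.prod(shape))                 # 18432
tucker_params = tucker_param_count(shape, ranks)   # 2450
rel_error = np.linalg.norm(weight - approx) / np.linalg.norm(weight)
print(dense_params, tucker_params, rel_error)
```

In this sketch the ranks are fixed by hand. Lower ranks store fewer parameters but increase the reconstruction error, and a random kernel like this one has no low-rank structure to exploit, so its error will be large. That trade-off across many layers is the rank-setting difficulty the abstract describes; the paper instead selects ranks during network training.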