An Accuracy-Preserving Neural Network Compression via Tucker Decomposition

Impact factor: 3.0; Zone 3 (Computer Science); JCR Q2 (Computer Science, Hardware & Architecture)

IEEE Transactions on Sustainable Computing; Pub Date: 2024-07-29; DOI: 10.1109/TSUSC.2024.3425962

Can Liu; Kun Xie; Jigang Wen; Gaogang Xie; Kenli Li

Volume 10, Issue 2, pp. 262-273. Full text: https://ieeexplore.ieee.org/document/10614384/

Citations: 0

Abstract

Deep learning has made remarkable progress across many domains, enabled by the capabilities of over-parameterized neural networks with increasing complexity. However, practical applications often necessitate compact and efficient networks because of device constraints. Among recent low-rank decomposition-based neural network compression techniques, Tucker decomposition has emerged as a promising method which effectively compresses the network while preserving the high-order structure and information of the parameters. Despite its promise, designing an efficient Tucker decomposition approach for compressing neural networks while maintaining accuracy is challenging, due to the complexity of setting ranks across multiple layers and the need for extensive fine-tuning. This paper introduces a novel accuracy-aware network compression problem under Tucker decomposition, which considers both network accuracy and compression performance in terms of parameter size. To address this problem, we propose an efficient alternating optimization algorithm that iteratively solves a network training sub-problem and a Tucker decomposition sub-problem to compress the network with performance assurance. The proper Tucker ranks of multiple layers are selected during network training, enabling efficient compression without extensive fine-tuning. We conduct extensive experiments, implementing image classification on five neural networks using four benchmark datasets. The experimental results demonstrate that, without the need for extensive fine-tuning, our proposed method significantly reduces the model size with minimal loss in accuracy, outperforming baseline methods.
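To make the parameter-size side of this trade-off concrete, the following is a minimal sketch of a Tucker decomposition of a single convolution kernel, computed with a truncated higher-order SVD (HOSVD). It is not the paper's alternating optimization algorithm, which selects ranks during training; here the ranks are fixed by hand. The kernel shape `(64, 32, 3, 3)`, the ranks `(16, 8, 3, 3)`, and all function names are illustrative assumptions, and the kernel is synthetic (built to be exactly low-rank) so that the reconstruction error is easy to interpret.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-n unfolding: move `mode` to the front and flatten the other axes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_dot(tensor, matrix, mode):
    """Mode-n product: contract axis `mode` of `tensor` with axis 1 of `matrix`."""
    moved = np.moveaxis(tensor, mode, 0)
    out = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(out, 0, mode)


def truncated_hosvd(tensor, ranks):
    """Tucker decomposition via truncated HOSVD (a standard baseline,
    assumed here for illustration; not the method proposed in the paper)."""
    factors = []
    for mode, rank in enumerate(ranks):
        # Leading left singular vectors of each unfolding span the mode subspace.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])
    core = tensor
    for mode, factor in enumerate(factors):
        core = mode_dot(core, factor.T, mode)  # project onto each subspace
    return core, factors


def reconstruct(core, factors):
    """Rebuild the full tensor from the Tucker core and factor matrices."""
    out = core
    for mode, factor in enumerate(factors):
        out = mode_dot(out, factor, mode)
    return out


def tucker_param_count(shape, ranks):
    """Parameters stored by the Tucker format: core plus one factor per mode."""
    return int(np.prod(ranks)) + sum(d * r for d, r in zip(shape, ranks))


# Hypothetical conv kernel: (out_channels, in_channels, kernel_h, kernel_w).
shape = (64, 32, 3, 3)
ranks = (16, 8, 3, 3)

# Synthetic kernel with multilinear rank equal to `ranks`.
rng = np.random.default_rng(0)
true_core = rng.standard_normal(ranks)
true_factors = [rng.standard_normal((d, r)) for d, r in zip(shape, ranks)]
kernel = reconstruct(true_core, true_factors)

core, factors = truncated_hosvd(kernel, ranks)
approx = reconstruct(core, factors)

dense_params = int(np.prod(shape))
tucker_params = tucker_param_count(shape, ranks)
rel_err = np.linalg.norm(kernel - approx) / np.linalg.norm(kernel)

print(f"dense parameters:  {dense_params}")
print(f"tucker parameters: {tucker_params}")
print(f"compression ratio: {dense_params / tucker_params:.2f}")
print(f"relative error:    {rel_err:.2e}")
```

For a trained kernel that is not exactly low-rank, lowering the ranks shrinks `tucker_params` but raises the reconstruction error, which is the per-layer rank-setting difficulty the abstract describes.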


Source journal

IEEE Transactions on Sustainable Computing (Mathematics: Control and Optimization)

CiteScore: 7.70

Self-citation rate: 2.60%

Articles published