On Compressing Deep Models by Low Rank and Sparse Decomposition

Xiyu Yu, Tongliang Liu, Xinchao Wang, D. Tao
{"title":"On Compressing Deep Models by Low Rank and Sparse Decomposition","authors":"Xiyu Yu, Tongliang Liu, Xinchao Wang, D. Tao","doi":"10.1109/CVPR.2017.15","DOIUrl":null,"url":null,"abstract":"Deep compression refers to removing the redundancy of parameters and feature maps for deep learning models. Low-rank approximation and pruning for sparse structures play a vital role in many compression works. However, weight filters tend to be both low-rank and sparse. Neglecting either part of these structure information in previous methods results in iteratively retraining, compromising accuracy, and low compression rates. Here we propose a unified framework integrating the low-rank and sparse decomposition of weight matrices with the feature map reconstructions. Our model includes methods like pruning connections as special cases, and is optimized by a fast SVD-free algorithm. It has been theoretically proven that, with a small sample, due to its generalizability, our model can well reconstruct the feature maps on both training and test data, which results in less compromising accuracy prior to the subsequent retraining. With such a warm start to retrain, the compression method always possesses several merits: (a) higher compression rates, (b) little loss of accuracy, and (c) fewer rounds to compress deep models. The experimental results on several popular models such as AlexNet, VGG-16, and GoogLeNet show that our model can significantly reduce the parameters for both convolutional and fully-connected layers. As a result, our model reduces the size of VGG-16 by 15×, better than other recent compression methods that use a single strategy.","PeriodicalId":6631,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"300 1","pages":"67-76"},"PeriodicalIF":0.0000,"publicationDate":"2017-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"349","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CVPR.2017.15","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 349

Abstract

Deep compression refers to removing the redundancy of parameters and feature maps in deep learning models. Low-rank approximation and pruning for sparse structures play a vital role in many compression works. However, weight filters tend to be both low-rank and sparse. Neglecting either part of this structural information, as previous methods do, results in iterative retraining, compromised accuracy, and low compression rates. Here we propose a unified framework that integrates the low-rank and sparse decomposition of weight matrices with feature map reconstruction. Our model includes methods such as pruning connections as special cases, and is optimized by a fast SVD-free algorithm. We prove theoretically that, even with a small sample, our model generalizes well enough to reconstruct the feature maps on both training and test data, which limits the loss of accuracy prior to the subsequent retraining. With such a warm start for retraining, the compression method possesses several merits: (a) higher compression rates, (b) little loss of accuracy, and (c) fewer rounds needed to compress deep models. Experimental results on several popular models, including AlexNet, VGG-16, and GoogLeNet, show that our model significantly reduces the parameters of both convolutional and fully-connected layers. As a result, it reduces the size of VGG-16 by 15×, outperforming recent compression methods that use a single strategy.
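To make the core idea concrete, here is a minimal sketch (not the authors' method) of a low-rank plus sparse decomposition of a single weight matrix. The paper's actual objective reconstructs feature maps and is solved by a fast SVD-free algorithm; this toy version instead decomposes the weights directly, alternating a truncated-SVD projection for the low-rank part with hard thresholding for the sparse part. The function name and the budgets `r` (target rank) and `k` (number of nonzeros) are hypothetical illustration choices, not values from the paper.

```python
import numpy as np

def low_rank_sparse_decompose(W, r, k, n_iters=50):
    """Approximate W as L + S, with rank(L) <= r and S having ~k nonzeros.

    Illustrative sketch only; the paper optimizes a feature-map
    reconstruction objective with an SVD-free solver instead.
    """
    L = np.zeros_like(W)
    S = np.zeros_like(W)
    for _ in range(n_iters):
        # Low-rank step: project the residual W - S onto rank-r matrices
        # via truncated SVD (the simple, non-SVD-free baseline).
        U, sigma, Vt = np.linalg.svd(W - S, full_matrices=False)
        L = (U[:, :r] * sigma[:r]) @ Vt[:r]
        # Sparse step: keep only the k largest-magnitude entries of W - L.
        R = W - L
        thresh = np.partition(np.abs(R), -k, axis=None)[-k]
        S = np.where(np.abs(R) >= thresh, R, 0.0)
    return L, S

# Toy usage on a random "weight matrix".
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32))
L, S = low_rank_sparse_decompose(W, r=8, k=100)
print(np.linalg.norm(W - (L + S)) / np.linalg.norm(W))  # relative error
```

The compression payoff comes from storage: a rank-r factorization of an m × n matrix costs r(m + n) numbers instead of mn, and the sparse part costs only its k nonzero values plus their indices, which is why capturing both structures at once can beat either strategy alone.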