Deep Neural Network Compression Method Based on Product Quantization

2020 39th Chinese Control Conference (CCC) Pub Date : 2020-07-01 DOI:10.23919/CCC50068.2020.9188698

Xiuqin Fang, Han Liu, Guo Xie, Youmin Zhang, Ding Liu

Citations: 3

Abstract

This paper proposes a method that combines pruning and product quantization to compress deep neural networks with large models and high computational cost. First, pruning removes redundant parameters from the network, and the pruned network is then fine-tuned. Next, product quantization quantizes the network parameters to 8 bits, which reduces storage overhead so that the network can be deployed on embedded devices. For classification tasks on the MNIST and CIFAR-10 datasets, network models such as LeNet5, AlexNet, and ResNet are compressed by a factor of 23 to 38 with as little loss of accuracy as possible.
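The abstract does not give implementation details, so the following is only a minimal NumPy sketch of the general idea: magnitude pruning of a weight matrix followed by product quantization with 256 centroids per subspace, so that each sub-vector is stored as an 8-bit index. The function names, the 50% sparsity, the sub-vector length of 8, the plain k-means routine, and the reading of "8 bits" as the codebook index width are all assumptions made for illustration, not the authors' settings. The fine-tuning step between pruning and quantization is omitted.

```python
import numpy as np

def magnitude_prune(w, sparsity):
    """Zero out (at least) the `sparsity` fraction of smallest-magnitude weights."""
    k = int(sparsity * w.size)
    if k == 0:
        return w.copy()
    # Magnitude of the k-th smallest weight serves as the pruning threshold.
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) <= threshold, 0.0, w).astype(w.dtype)

def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (centroids, assignment index per row)."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        dist = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = dist.argmin(1)
        for j in range(k):
            members = x[assign == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(0)
    dist = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    return centroids, dist.argmin(1)

def product_quantize(w, sub_dim, k=256):
    """Split columns into blocks of `sub_dim` and learn one codebook per block."""
    rows, cols = w.shape
    assert cols % sub_dim == 0 and rows >= k
    codebooks, codes = [], []
    for start in range(0, cols, sub_dim):
        c, idx = kmeans(w[:, start:start + sub_dim], k)
        codebooks.append(c)
        codes.append(idx.astype(np.uint8))  # k <= 256, so an index fits in 8 bits
    return np.stack(codebooks), np.stack(codes, axis=1)

def reconstruct(codebooks, codes):
    """Rebuild an approximate weight matrix by codebook lookup."""
    blocks = [codebooks[s][codes[:, s]] for s in range(codes.shape[1])]
    return np.concatenate(blocks, axis=1)

# Toy example on random weights (a stand-in for one layer's weight matrix).
w = np.random.default_rng(0).standard_normal((512, 64)).astype(np.float32)
w_pruned = magnitude_prune(w, sparsity=0.5)
codebooks, codes = product_quantize(w_pruned, sub_dim=8)
w_hat = reconstruct(codebooks, codes)
mse = float(np.mean((w_pruned - w_hat) ** 2))
print("codes:", codes.shape, codes.dtype, "reconstruction MSE:", mse)
```

In this sketch the stored model consists of the `uint8` code matrix plus the codebooks. For a matrix this small the codebooks dominate the storage, so the toy example says nothing about the 23 to 38 times compression reported in the abstract.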



