HKDP: A Hybrid Approach On Knowledge Distillation and Pruning for Neural Network Compression

2021 18th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), Pub Date: 2021-12-17, DOI: 10.1109/ICCWAMTIP53232.2021.9674054

Chen Hongle, Shi Qirui, Chen Juan, Wen Quan


Citations: 0

Abstract


HKDP: A Hybrid Approach On Knowledge Distillation and Pruning for Neural Network Compression

Pruning is a popular way to shrink over-parameterized networks: it reduces the parameter count and computational cost while keeping accuracy close to that of the original network. However, a general weight-pruning algorithm only removes parameters within the original network structure; it cannot reduce the width or depth of the pruned network. Knowledge distillation addresses this by compressing the network structure itself, but it cannot further modify the resulting network. To shrink the network further, we propose HKDP, a hybrid model compression algorithm that combines knowledge distillation and network pruning to significantly reduce the overall size of the network while maintaining substantial accuracy. By combining the advantages of both techniques, HKDP achieves a 10 times higher compression rate and 2 percent higher accuracy than either algorithm alone. Concretely, we first apply a stage-wise knowledge distillation algorithm that quickly and efficiently reduces the original model structure; we then apply a Stochastic Gradient Descent (SGD) based pruning method and introduce the concept of global sparsity, which lets us customize the compression rate of the model. Our experiments on CIFAR-10 and MNIST show that the hybrid algorithm achieves higher model accuracy and a higher compression ratio than competing network compression algorithms.
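The abstract gives no implementation details, so the following is only a minimal sketch of the two ingredients it names, not the authors' HKDP code. It assumes the standard softened-output distillation loss (a KL divergence with a temperature) in place of the paper's stage-wise scheme, and a simple magnitude ranking in place of the paper's SGD-based pruning criterion. The function names, the `temperature` parameter, and the toy layer names `conv` and `fc` are all hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened outputs, scaled by T^2.

    Assumption: this is the generic distillation loss, used here as a
    stand-in for the stage-wise distillation mentioned in the abstract.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


def global_prune(layers, sparsity):
    """Zero the smallest-magnitude weights across ALL layers at once.

    `layers` maps a layer name to a flat list of weights. A single global
    sparsity target fixes the overall fraction of zeroed weights, while
    the fraction removed from each individual layer is free to differ.
    Assumption: magnitude ranking replaces the paper's SGD-based criterion.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    ranked = sorted(
        (abs(w), name, i)
        for name, weights in layers.items()
        for i, w in enumerate(weights)
    )
    n_prune = int(len(ranked) * sparsity)
    pruned = {name: list(weights) for name, weights in layers.items()}
    for _, name, i in ranked[:n_prune]:
        pruned[name][i] = 0.0
    return pruned


def sparsity_of(weights):
    """Fraction of exactly-zero entries in a flat weight list."""
    return sum(1 for w in weights if w == 0.0) / len(weights)


if __name__ == "__main__":
    toy = {"conv": [0.9, -0.05, 0.4, 0.8], "fc": [-0.02, 0.7, 0.03, 0.01]}
    result = global_prune(toy, sparsity=0.5)
    for name, weights in result.items():
        print(name, weights, sparsity_of(weights))
    print(distillation_loss([1.0, 0.5, -0.5], [2.0, 0.1, -1.0]))
```

In this toy run the global target of 0.5 removes one weight from `conv` and three from `fc`, which is the point of a global (rather than per-layer) sparsity setting: the overall compression rate is chosen directly and layers with many small weights absorb more of the pruning.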
