{"title":"基于模型剪枝和知识蒸馏的超参数自动优化","authors":"Min Wu, Weihua Ma, Yue Li, Xiongbo Zhao","doi":"10.1109/ICCEIC51584.2020.00030","DOIUrl":null,"url":null,"abstract":"In recent years, deep neural network has been widely used in computer vision, speech recognition and other fields. However, to obtain better performance, it needs to design a network with higher complexity, and the corresponding model calculation amount and storage space are also increasing. At the same time, the computing resources and energy consumption budget of mobile devices are very limited. Therefore, model compression is very important for deploying neural network models on mobile devices. Knowledge distillation technology based on transfer learning is an effective method to realize model compression. This study proposes: the model pruning technology is introduced into the student network design of knowledge distillation, and the super parameters (temperature T, scale factor λ, pruning rate ϒ) are automatically optimized, and the optimal combination of parameters is selected as the final value according to the final performance. The results show that, compared with the commonly used pruning techniques, this method can effectively improve the accuracy of the network without increasing the network size, and the network performance can be further improved by adjusting the value of super parameters.","PeriodicalId":135840,"journal":{"name":"2020 International Conference on Computer Engineering and Intelligent Control (ICCEIC)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Automatic Optimization of super Parameters Based on Model Pruning and Knowledge Distillation\",\"authors\":\"Min Wu, Weihua Ma, Yue Li, Xiongbo Zhao\",\"doi\":\"10.1109/ICCEIC51584.2020.00030\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In recent years, deep neural network has been widely used in computer vision, speech recognition and other fields. However, to obtain better performance, it needs to design a network with higher complexity, and the corresponding model calculation amount and storage space are also increasing. At the same time, the computing resources and energy consumption budget of mobile devices are very limited. Therefore, model compression is very important for deploying neural network models on mobile devices. Knowledge distillation technology based on transfer learning is an effective method to realize model compression. This study proposes: the model pruning technology is introduced into the student network design of knowledge distillation, and the super parameters (temperature T, scale factor λ, pruning rate ϒ) are automatically optimized, and the optimal combination of parameters is selected as the final value according to the final performance. 
The results show that, compared with the commonly used pruning techniques, this method can effectively improve the accuracy of the network without increasing the network size, and the network performance can be further improved by adjusting the value of super parameters.\",\"PeriodicalId\":135840,\"journal\":{\"name\":\"2020 International Conference on Computer Engineering and Intelligent Control (ICCEIC)\",\"volume\":\"6 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 International Conference on Computer Engineering and Intelligent Control (ICCEIC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICCEIC51584.2020.00030\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 International Conference on Computer Engineering and Intelligent Control (ICCEIC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCEIC51584.2020.00030","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
In recent years, deep neural networks have been widely used in computer vision, speech recognition, and other fields. However, achieving better performance usually requires networks of higher complexity, which in turn increases the model's computational cost and storage footprint. At the same time, the computing resources and energy budget of mobile devices are very limited, so model compression is essential for deploying neural network models on such devices. Knowledge distillation, a technique based on transfer learning, is an effective way to achieve model compression. This study introduces model pruning into the design of the student network used for knowledge distillation and automatically optimizes the hyperparameters (temperature T, scale factor λ, pruning rate γ), selecting the combination that yields the best final performance. The results show that, compared with commonly used pruning techniques, the method effectively improves network accuracy without increasing network size, and performance can be further improved by tuning the hyperparameter values.
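
The abstract describes a distillation loss controlled by a temperature T and a scale factor λ, a pruned student network with pruning rate γ, and a search over (T, λ, γ) combinations that keeps the best-performing one. The sketch below illustrates that pipeline under common assumptions: the standard softened-softmax KL distillation loss, L1 unstructured pruning, and a simple grid search. The grid values, the prune_student helper, and the train_and_evaluate callback are illustrative placeholders, not the authors' exact procedure.

```python
# Minimal sketch (PyTorch), assuming a standard KD loss and L1 pruning.
import itertools
import torch
import torch.nn.functional as F
import torch.nn.utils.prune as prune

def distillation_loss(student_logits, teacher_logits, labels, T, lam):
    """Soft-target KL term (scaled by T^2) blended with hard-label CE via lambda."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return lam * soft + (1.0 - lam) * hard

def prune_student(model, gamma):
    """L1 unstructured pruning of conv/linear weights at rate gamma (assumed scheme)."""
    for module in model.modules():
        if isinstance(module, (torch.nn.Conv2d, torch.nn.Linear)):
            prune.l1_unstructured(module, name="weight", amount=gamma)
    return model

def search_hyperparameters(make_student, teacher, train_and_evaluate):
    """Grid-search (T, lambda, gamma); candidate values are illustrative only.

    train_and_evaluate is a placeholder that trains the pruned student under
    distillation from the teacher and returns validation accuracy.
    """
    grid = itertools.product([2, 4, 8], [0.3, 0.5, 0.7], [0.3, 0.5, 0.7])
    best = None
    for T, lam, gamma in grid:
        student = prune_student(make_student(), gamma)
        acc = train_and_evaluate(student, teacher, T, lam)
        if best is None or acc > best[0]:
            best = (acc, T, lam, gamma)
    return best  # (best accuracy, T, lambda, gamma)
```

The design mirrors the abstract's claim that the hyperparameter combination is chosen by final performance: each candidate (T, λ, γ) produces a pruned, distilled student, and only the combination with the highest validation accuracy is kept as the final setting.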