Title: A Deep Neural Network Compression Algorithm Based on Knowledge Transfer for Edge Device
Authors: Chao Li, Xiaolong Ma, Zhulin An, Yongjun Xu
Venue: 2018 IEEE/ACM Symposium on Edge Computing (SEC)
Published: 2018-10-01
DOI: 10.1109/SEC.2018.00035 (https://doi.org/10.1109/SEC.2018.00035)
Citations: 4
Abstract
The computation and storage capacity of edge devices is limited, which seriously restricts the application of deep neural networks on such devices. To enable intelligent applications on edge devices, we introduce a deep neural network compression algorithm based on knowledge transfer: a three-stage pipeline of lightweighting, multi-level knowledge transfer, and pruning that reduces the network depth, parameter count, and operation complexity of deep neural networks. We lighten the neural networks by using a global average pooling layer instead of a fully connected layer and by replacing standard convolutions with separable convolutions. Next, multi-level knowledge transfer minimizes the difference between the outputs of the "student network" and the "teacher network" at the middle and logits layers, increasing the supervisory information available when training the "student network". Lastly, we prune the network by cutting off unimportant convolution kernels with a global iterative pruning strategy.
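The three stages can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not give the loss or the importance criterion, so the mean-squared-error loss, the `alpha` weighting, the L1-norm kernel score, and all function names here are assumptions made for illustration. The parameter counts assume a depthwise separable convolution (a depthwise k x k convolution followed by a 1 x 1 pointwise convolution) with biases ignored.

```python
# Illustrative sketch only. Function names, the MSE loss, the alpha weight,
# and the L1-norm importance score are assumptions, not taken from the paper.


def conv_params(k, c_in, c_out):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weight count of a depthwise k x k convolution followed by a
    1 x 1 pointwise convolution (bias ignored)."""
    return k * k * c_in + c_in * c_out


def mse(a, b):
    """Mean squared error between two equal-length lists of floats."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def transfer_loss(student_mid, teacher_mid,
                  student_logits, teacher_logits, alpha=0.5):
    """Weighted sum of the middle-layer and logits-layer differences
    between student and teacher (MSE is an assumed choice)."""
    return (alpha * mse(student_mid, teacher_mid)
            + (1 - alpha) * mse(student_logits, teacher_logits))


def global_prune_step(kernels, fraction):
    """One pruning step: rank all kernels from all layers together and
    return the ids of the lowest-scoring fraction.

    kernels maps (layer_name, kernel_index) to a flat list of weights.
    The score is the L1 norm of the weights (an assumed criterion).
    """
    scores = {kid: sum(abs(w) for w in ws) for kid, ws in kernels.items()}
    n_prune = int(len(scores) * fraction)
    ranked = sorted(scores, key=scores.get)  # ascending importance
    return set(ranked[:n_prune])


if __name__ == "__main__":
    print(conv_params(3, 64, 128), separable_conv_params(3, 64, 128))
    kernels = {
        ("conv1", 0): [0.1, -0.1],
        ("conv1", 1): [1.0, 1.0],
        ("conv2", 0): [0.01, 0.02],
        ("conv2", 1): [2.0, -2.0],
    }
    print(global_prune_step(kernels, 0.5))
```

For a 3 x 3 layer with 64 input and 128 output channels, the standard convolution has 73,728 weights and the separable version 8,768, which is the kind of reduction the lightweighting stage relies on. An iterative strategy would call a step like `global_prune_step` repeatedly, fine-tuning the network between steps; that loop is omitted here.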