Authors: Chen Tang, Wenyu Sun, Wenxun Wang, Yongpan Liu
Venue: 2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)
Publication date: 2022-01-17
DOI: 10.1109/ASP-DAC52403.2022.9712483
Dynamic CNN Accelerator Supporting Efficient Filter Generator with Kernel Enhancement and Online Channel Pruning
Deep neural networks achieve exciting performance on several tasks, but at heavy storage and computation costs. Previous works adopt pruning-based methods to slim deep networks. In traditional pruning, either the convolution kernel or the network inference is static, which prevents full compression of the model parameters and restrains performance. In this paper, we propose an online pruning algorithm that supports dynamic kernel generation and dynamic network inference at the same time. Two novel techniques are proposed: a filter generator and importance-level based channel pruning. Moreover, we validate the proposed method with an implementation on an Ultra96-v2 FPGA. Compared with state-of-the-art static or dynamic pruning methods, our method reduces the top-5 accuracy drop by nearly 50% for a ResNet model on ImageNet at a similar compression level. It also achieves better accuracy while storing up to 50% fewer weights on chip.
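To make the "online" aspect concrete, the following is a minimal sketch of what importance-level based channel pruning at inference time could look like. It is an illustration only, not the paper's implementation: the function name `online_channel_prune`, the mean-absolute-activation importance metric, and the top-k keep rule are all assumptions; the paper's actual importance measure and hardware mapping may differ.

```python
import numpy as np

def online_channel_prune(fmap, keep_ratio=0.5):
    """Hypothetical sketch of importance-level based online channel pruning.

    fmap: activation tensor of shape (C, H, W) for one layer, one input.
    Channels with the lowest importance score are zeroed at inference
    time, so the pruning decision adapts to each individual input
    rather than being fixed at training time (static pruning).
    """
    C = fmap.shape[0]
    # Per-channel importance score: mean absolute activation.
    # (An assumed metric; the paper's exact importance level may differ.)
    importance = np.abs(fmap).reshape(C, -1).mean(axis=1)
    # Keep only the top-k most important channels for this input.
    k = max(1, int(round(keep_ratio * C)))
    keep = np.argsort(importance)[-k:]
    mask = np.zeros(C, dtype=fmap.dtype)
    mask[keep] = 1.0
    # Zeroed channels need not be computed or fetched by the accelerator.
    return fmap * mask[:, None, None], keep
```

Because the mask is recomputed per input, different inputs can activate different channel subsets, which is what distinguishes this from static pruning, where the pruned channels are fixed once and for all.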