{"title":"Exploiting Dynamic Bit Sparsity in Activation for Deep Neural Network Acceleration","authors":"Yongshuai Sun, Mengyuan Guo, Dacheng Liang, Shan Tang, Naifeng Jing","doi":"10.1109/ASICON52560.2021.9620448","DOIUrl":null,"url":null,"abstract":"Data sparsity is important in accelerating deep neural networks (DNNs). However, besides the zeroed values, the bit sparsity especially in activations are oftentimes missing in conventional DNN accelerators. In this paper, we present a DNN accelerator to exploit the bit sparsity by dynamically skipping zeroed bits in activations. To this goal, we first substitute the multiply-and-accumulate (MAC) units with more serial shift-and-accumulate units to sustain the computing parallelism. To prevent the low efficiency caused by the random number and positions of the zeroed bits in different activations, we propose activation-grouping, so that the activations in the same group can be computed on non-zero bits in different channels freely, and synchronization is only needed between different groups. We implement the proposed accelerator with 16 process units (PU) and 16 processing elements (PE) in each PU on FPGA built upon VTA (Versatile Tensor Accelerator) which can integrate seamlessly with TVM compilation. We evaluate the efficiency of our design with convolutional layers in resnet18 respectively, which achieves over 3.2x speedup on average compared with VTA design. In terms of the whole network, it can achieve over 2.26x speedup and over 2.0x improvement on area efficiency.","PeriodicalId":233584,"journal":{"name":"2021 IEEE 14th International Conference on ASIC (ASICON)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE 14th International Conference on ASIC (ASICON)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASICON52560.2021.9620448","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Data sparsity is important for accelerating deep neural networks (DNNs). However, beyond zero-valued data, bit-level sparsity, especially in activations, is often left unexploited by conventional DNN accelerators. In this paper, we present a DNN accelerator that exploits bit sparsity by dynamically skipping zero bits in activations. To this end, we first replace the multiply-and-accumulate (MAC) units with a larger number of serial shift-and-accumulate units to sustain computing parallelism. To avoid the efficiency loss caused by the varying number and positions of zero bits across activations, we propose activation grouping: activations within the same group process their non-zero bits in different channels independently, and synchronization is needed only between groups. We implement the proposed accelerator with 16 processing units (PUs), each containing 16 processing elements (PEs), on an FPGA built upon the Versatile Tensor Accelerator (VTA), so it integrates seamlessly with the TVM compilation flow. Evaluated on the convolutional layers of ResNet-18, our design achieves over 3.2x speedup on average compared with the baseline VTA design. For the whole network, it achieves over 2.26x speedup and over 2.0x improvement in area efficiency.
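The following is a minimal software sketch, not the authors' RTL, illustrating the core idea the abstract describes: decomposing a multiply-accumulate into shift-and-accumulate steps over only the non-zero bits of each activation, so sparser bit patterns finish in fewer serial cycles. The function names (`nonzero_bit_positions`, `bit_serial_dot`) and the cycle-counting convention are illustrative assumptions, not part of the paper.

```python
# Sketch of bit-serial MAC with dynamic zero-bit skipping in activations.
# Assumption: unsigned activations, signed integer weights; one "cycle" per
# non-zero activation bit, mimicking a serial shift-and-accumulate PE.

def nonzero_bit_positions(value: int):
    """Yield the positions of the set bits of a non-negative activation."""
    pos = 0
    while value:
        if value & 1:
            yield pos
        value >>= 1
        pos += 1

def bit_serial_dot(activations, weights):
    """Dot product via shift-and-accumulate over non-zero activation bits only."""
    acc = 0
    cycles = 0
    for a, w in zip(activations, weights):
        for p in nonzero_bit_positions(a):
            acc += w << p   # shift-and-add replaces the full multiplication
            cycles += 1
    return acc, cycles

if __name__ == "__main__":
    acts = [0b00010010, 0b10000000, 0b00000000, 0b00000101]  # 8-bit activations
    wts = [3, -2, 7, 5]
    result, cycles = bit_serial_dot(acts, wts)
    assert result == sum(a * w for a, w in zip(acts, wts))
    print(f"dot = {result}, serial cycles = {cycles} "
          f"(vs. {8 * len(acts)} cycles without zero-bit skipping)")
```

In this toy example only 5 of 32 possible bit cycles carry non-zero bits, which is the source of the speedup; the activation-grouping scheme in the paper addresses the load imbalance that arises when different activations in a group need different numbers of such cycles.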