PIM-Prune:基于交叉棒的内存中进程架构的细粒度DCNN剪枝

2020 57th ACM/IEEE Design Automation Conference (DAC) Pub Date : 2020-07-01 DOI:10.1109/DAC18072.2020.9218523

Chaoqun Chu, Yanzhi Wang, Yilong Zhao, Xiaolong Ma, Shaokai Ye, Yunyan Hong, Xiaoyao Liang, Yinhe Han, Li Jiang

{"title":"PIM-Prune:基于交叉棒的内存中进程架构的细粒度DCNN剪枝","authors":"Chaoqun Chu, Yanzhi Wang, Yilong Zhao, Xiaolong Ma, Shaokai Ye, Yunyan Hong, Xiaoyao Liang, Yinhe Han, Li Jiang","doi":"10.1109/DAC18072.2020.9218523","DOIUrl":null,"url":null,"abstract":"Deep Convolution Neural network (DCNN) pruning is an efficient way to reduce the resource and power consumption in a DCNN accelerator. Exploiting the sparsity in the weight matrices of DCNNs, however, is nontrivial if we deploy these DC-NNs in a crossbar-based Process-In-Memory (PIM) architecture, because of the crossbar structure. Structural pruning-exploiting a coarse-grained sparsity, such as filter/channel-level pruning-can result in a compressed weight matrix that fits the crossbar structure. However, this pruning method inevitably degrades the model accuracy. To solve this problem, in this paper, we propose PIM-PRUNE to exploit the finer-grained sparsity in PIM-architecture, and the resulting compressed weight matrices can significantly reduce the demand of crossbars with negligible accuracy loss.Further, we explore the design space of the crossbar, such as the crossbar size and aspect-ratio, from a new point-of-view of resource-oriented pruning. We find a trade-off existing between the pruning algorithm and the hardware overhead: a PIM with smaller crossbars is more friendly for pruning methods; however, the resulting peripheral circuit cause higher power consumption. Given a specific DCNN, we can suggest a sweet-spot of crossbar design to the optimal overall energy efficiency. Experimental results show that the proposed pruning method applied on Resnet18 can achieve up to 24.85× and 3.56× higher compression rate of occupied crossbars on CifarlO and Imagenet, respectively; while the accuracy loss is negligible, which is 4.56× and 1.99× better than the state-of-art methods.","PeriodicalId":428807,"journal":{"name":"2020 57th ACM/IEEE Design Automation Conference (DAC)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"33","resultStr":"{\"title\":\"PIM-Prune: Fine-Grain DCNN Pruning for Crossbar-Based Process-In-Memory Architecture\",\"authors\":\"Chaoqun Chu, Yanzhi Wang, Yilong Zhao, Xiaolong Ma, Shaokai Ye, Yunyan Hong, Xiaoyao Liang, Yinhe Han, Li Jiang\",\"doi\":\"10.1109/DAC18072.2020.9218523\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep Convolution Neural network (DCNN) pruning is an efficient way to reduce the resource and power consumption in a DCNN accelerator. Exploiting the sparsity in the weight matrices of DCNNs, however, is nontrivial if we deploy these DC-NNs in a crossbar-based Process-In-Memory (PIM) architecture, because of the crossbar structure. Structural pruning-exploiting a coarse-grained sparsity, such as filter/channel-level pruning-can result in a compressed weight matrix that fits the crossbar structure. However, this pruning method inevitably degrades the model accuracy. To solve this problem, in this paper, we propose PIM-PRUNE to exploit the finer-grained sparsity in PIM-architecture, and the resulting compressed weight matrices can significantly reduce the demand of crossbars with negligible accuracy loss.Further, we explore the design space of the crossbar, such as the crossbar size and aspect-ratio, from a new point-of-view of resource-oriented pruning. We find a trade-off existing between the pruning algorithm and the hardware overhead: a PIM with smaller crossbars is more friendly for pruning methods; however, the resulting peripheral circuit cause higher power consumption. Given a specific DCNN, we can suggest a sweet-spot of crossbar design to the optimal overall energy efficiency. Experimental results show that the proposed pruning method applied on Resnet18 can achieve up to 24.85× and 3.56× higher compression rate of occupied crossbars on CifarlO and Imagenet, respectively; while the accuracy loss is negligible, which is 4.56× and 1.99× better than the state-of-art methods.\",\"PeriodicalId\":428807,\"journal\":{\"name\":\"2020 57th ACM/IEEE Design Automation Conference (DAC)\",\"volume\":\"29 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-07-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"33\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 57th ACM/IEEE Design Automation Conference (DAC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/DAC18072.2020.9218523\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 57th ACM/IEEE Design Automation Conference (DAC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DAC18072.2020.9218523","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 33

摘要

深度卷积神经网络(DCNN)剪枝是减少DCNN加速器资源和功耗的有效方法。然而，如果我们将这些dc - nn部署在基于交叉栏的内存进程(PIM)体系结构中，由于交叉栏结构的原因，利用DCNNs权重矩阵中的稀疏性是非常重要的。结构性修剪——利用粗粒度的稀疏性，例如过滤器/通道级修剪——可以产生适合横杆结构的压缩权重矩阵。然而，这种修剪方法不可避免地降低了模型的精度。为了解决这一问题，本文提出了PIM-PRUNE，利用pim -架构中的细粒度稀疏性，得到的压缩权矩阵可以显著减少对横条的需求，且精度损失可以忽略不计。在此基础上，从资源导向剪枝的新视角出发，探讨了横梁的尺寸、纵横比等设计空间。我们发现剪枝算法和硬件开销之间存在权衡:具有较小交叉条的PIM对剪枝方法更友好;然而，由此产生的外围电路导致更高的功耗。给定一个特定的DCNN，我们可以建议一个最佳的横杆设计点，以达到最佳的整体能源效率。实验结果表明，该方法在Resnet18上的压缩率分别比CifarlO和Imagenet上的压缩率提高了24.85倍和3.56倍;而精度损失可以忽略不计，分别比现有方法提高了4.56倍和1.99倍。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

PIM-Prune: Fine-Grain DCNN Pruning for Crossbar-Based Process-In-Memory Architecture

Deep Convolution Neural network (DCNN) pruning is an efficient way to reduce the resource and power consumption in a DCNN accelerator. Exploiting the sparsity in the weight matrices of DCNNs, however, is nontrivial if we deploy these DC-NNs in a crossbar-based Process-In-Memory (PIM) architecture, because of the crossbar structure. Structural pruning-exploiting a coarse-grained sparsity, such as filter/channel-level pruning-can result in a compressed weight matrix that fits the crossbar structure. However, this pruning method inevitably degrades the model accuracy. To solve this problem, in this paper, we propose PIM-PRUNE to exploit the finer-grained sparsity in PIM-architecture, and the resulting compressed weight matrices can significantly reduce the demand of crossbars with negligible accuracy loss.Further, we explore the design space of the crossbar, such as the crossbar size and aspect-ratio, from a new point-of-view of resource-oriented pruning. We find a trade-off existing between the pruning algorithm and the hardware overhead: a PIM with smaller crossbars is more friendly for pruning methods; however, the resulting peripheral circuit cause higher power consumption. Given a specific DCNN, we can suggest a sweet-spot of crossbar design to the optimal overall energy efficiency. Experimental results show that the proposed pruning method applied on Resnet18 can achieve up to 24.85× and 3.56× higher compression rate of occupied crossbars on CifarlO and Imagenet, respectively; while the accuracy loss is negligible, which is 4.56× and 1.99× better than the state-of-art methods.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2020 57th ACM/IEEE Design Automation Conference (DAC)

自引率

0.00%

发文量