Z-PIM: An Energy-Efficient Sparsity Aware Processing-In-Memory Architecture with Fully-Variable Weight Precision
Jihoon Kim, Juhyoung Lee, Jinsu Lee, H. Yoo, Joo-Young Kim
2020 IEEE Symposium on VLSI Circuits, June 2020
DOI: 10.1109/VLSICircuits18222.2020.9163015
Citations: 13
Abstract
This paper presents Z-PIM, an energy-efficient processing-in-memory (PIM) architecture that supports zero-skipping operations and fully-variable weight bit-precision for efficient deep neural network (DNN) processing. Its 8T-SRAM-cell-based bit-serial operation with a hierarchical bit-line structure enables variable weight precision and reduces bit-line switching by 95.42% in the convolution layers of VGG-16. Z-PIM exploits the abundant zeros in weight data by skip-reading their corresponding input data, while read-sequence rearranging and pipelining improve throughput by 66.1%. In addition, diagonal accumulation logic is proposed to accumulate both the partial sums of the bit-serial operation and the spatial products. As a result, the Z-PIM chip, fabricated in a 65nm process, consumes 5.294mW on average and achieves 0.31–49.12 TOPS/W energy efficiency for convolution operations as sparsity varies from 0.1 to 0.9 and weight bit-precision from 1b to 16b.
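To make the zero-skipping and bit-serial ideas from the abstract concrete, here is a minimal Python sketch of a zero-skipping bit-serial dot product. This is not the authors' implementation: the function name and structure are hypothetical, and it only models the access pattern in software, where the energy benefit on real hardware comes from never reading inputs paired with zero weight bits.

```python
import numpy as np

def zero_skipping_bit_serial_dot(weights, inputs, precision=8):
    """Software sketch (hypothetical) of a zero-skipping bit-serial dot product.

    Weights are processed one bit-plane at a time, LSB to MSB, as in
    bit-serial PIM. Inputs are only "read" where the current weight bit
    is 1, modeling Z-PIM-style skip-reading; per-plane partial sums are
    shifted and accumulated into the final result.
    """
    weights = np.asarray(weights, dtype=np.int64)
    inputs = np.asarray(inputs, dtype=np.int64)
    acc = 0
    for bit in range(precision):              # one weight bit-plane per step
        plane = (weights >> bit) & 1          # current bit of every weight
        nonzero = np.nonzero(plane)[0]        # zero-skipping: indices of 1-bits
        if nonzero.size == 0:                 # all-zero plane -> nothing to read
            continue
        partial = inputs[nonzero].sum()       # partial sum for this bit-plane
        acc += partial << bit                 # shift-accumulate by bit weight
    return acc

# The result matches an ordinary dot product; only the access pattern differs.
w = [3, 0, 0, 5, 0, 1]
x = [7, 9, 4, 2, 6, 8]
assert zero_skipping_bit_serial_dot(w, x) == np.dot(w, x)
```

Because `precision` is just the loop bound, the same sketch covers the fully-variable 1b–16b weight precision the paper describes: narrower weights simply run fewer bit-plane iterations.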