A Charge Domain SRAM Computing-in-Memory Macro With Quantized Interval-Optimized ADC and Input Bit-Level Sparsity-Optimized P2O-DAC for 8-b MAC Operation
Authors: Shukao Dou; Zupei Gu; Heng You; Yi Zhan; Shushan Qiao; Yumei Zhou
Journal: IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 33, no. 5, pp. 1467-1471
Publication date: 2024-12-05
DOI: 10.1109/TVLSI.2024.3509432
URL: https://ieeexplore.ieee.org/document/10778979/
Computing-in-memory (CIM) has recently gained significant attention as it achieves high energy efficiency and throughput for deep convolutional neural networks (DCNNs). In this brief, we present a static random access memory (SRAM) CIM macro aimed at improving the energy efficiency of edge devices when performing 8-b multiply-and-accumulate (MAC) operations. The proposed architecture implements the following: 1) a successive approximation register analog-to-digital converter (SAR ADC) readout circuit based on a weight-flip-store (WFS) coding scheme, where energy efficiency is improved by optimizing the quantized interval; 2) an input-relevant partial power-off digital-to-analog converter (P2O-DAC) using input bit-level sparsity to reduce power consumption; and 3) a pipeline structure for interleaving MAC computation and readout operation to minimize the redundancy when loading input data into the CIM array. Our proposed CIM macro is implemented in TSMC 40-nm CMOS technology. Postlayout simulation results show an average macro energy efficiency of 16.8 TOPS/W without input and weight value sparsity.
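The abstract does not describe the circuits in detail, so the following is only a generic software sketch of what an 8-b MAC and input bit-level sparsity mean, not the authors' P2O-DAC or ADC design. It assumes unsigned 8-b inputs and signed 8-b weights; the function names `mac_8b`, `mac_bitwise`, and `zero_bit_fraction` are hypothetical and chosen for illustration.

```python
from typing import List


def mac_8b(inputs: List[int], weights: List[int]) -> int:
    """Reference multiply-and-accumulate: sum of input * weight products."""
    return sum(x * w for x, w in zip(inputs, weights))


def mac_bitwise(inputs: List[int], weights: List[int]) -> int:
    """Same MAC, with each unsigned 8-b input split into its 8 bits.

    A zero input bit contributes nothing to the sum. This is the property
    that a bit-level sparsity scheme can exploit (assumption: how the
    hardware exploits it is not stated in the abstract).
    """
    total = 0
    for bit in range(8):
        # Partial sum over all inputs for this bit position only.
        partial = sum(((x >> bit) & 1) * w for x, w in zip(inputs, weights))
        total += partial * (1 << bit)
    return total


def zero_bit_fraction(inputs: List[int]) -> float:
    """Fraction of zero bits across all unsigned 8-b inputs."""
    zeros = sum(8 - bin(x & 0xFF).count("1") for x in inputs)
    return zeros / (8 * len(inputs))


if __name__ == "__main__":
    xs = [0, 1, 128, 255]      # unsigned 8-b inputs (illustrative values)
    ws = [-128, 127, -3, 5]    # signed 8-b weights (illustrative values)
    print(mac_8b(xs, ws), mac_bitwise(xs, ws), zero_bit_fraction(xs))
```

Inputs such as 1 or 128 are nonzero, so they do not count toward value sparsity, yet seven of their eight bits are zero. This is why bit-level sparsity can be present even when, as in the reported result, no input or weight value sparsity is assumed.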
About the journal:
The IEEE Transactions on VLSI Systems is published as a monthly journal under the co-sponsorship of the IEEE Circuits and Systems Society, the IEEE Computer Society, and the IEEE Solid-State Circuits Society.
Design and realization of microelectronic systems using VLSI/ULSI technologies require close collaboration among scientists and engineers in the fields of systems architecture, logic and circuit design, chips and wafer fabrication, packaging, testing and systems applications. Generation of specifications, design and verification must be performed at all abstraction levels, including the system, register-transfer, logic, circuit, transistor and process levels.
To address this critical area through a common forum, the IEEE Transactions on VLSI Systems was founded. The editorial board, consisting of international experts, invites original papers that emphasize the novel systems integration aspects of microelectronic systems, including interactions among systems design and partitioning, logic and memory design, digital and analog circuit design, layout synthesis, CAD tools, chips and wafer fabrication, testing and packaging, and systems-level qualification. Thus, the coverage of these Transactions will focus on VLSI/ULSI microelectronic systems integration.