A 40nm RRAM Compute-in-Memory Macro with Parallelism-Preserving ECC for Iso-Accuracy Voltage Scaling

ESSCIRC 2022- IEEE 48th European Solid State Circuits Conference (ESSCIRC) | Pub Date: 2022-09-19 | DOI: 10.1109/ESSCIRC55480.2022.9911464

Wantong Li, James Read, Hongwu Jiang, Shimeng Yu

Citations: 4

Abstract

Compute-in-memory (CIM) employing resistive random access memory (RRAM) has been widely investigated as an attractive candidate to accelerate the heavy multiply-and-accumulate (MAC) workloads in deep neural network (DNN) inference. Supply voltage (VDD) scaling for compute engines is a popular technique that lets edge devices toggle between high-performance and low-power modes. While prior CIM works have examined VDD scaling, they have not explored its effects on hardware errors and inference accuracy. In this work, we design and validate an RRAM-based CIM macro with a novel error correction code (ECC), called MAC-ECC, that can be reconfigured to correct errors arising from scaled VDD while preserving the parallelism of CIM. This enables RRAM-CIM to perform iso-accuracy inference across different operation modes. We design specialized hardware to implement the MAC-ECC decoder and insert it into the existing compute pipeline without throughput overhead. Additionally, we conduct measurements to characterize the effect of VDD scaling on errors in CIM. The macro is taped out in the TSMC N40 RRAM process. For $1\times 1$b MAC operations on the DenseNet-40 network, it achieves 59.1 TOPS/W and 70.9 GOPS/mm² at a VDD of 0.7 V, and 43.0 TOPS/W and 112.5 GOPS/mm² at a VDD of 1.0 V. The design maintains <1% accuracy loss on the CIFAR-10 dataset across the tested VDDs.
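The trade-off between the two operating points can be made concrete with a little arithmetic on the figures quoted above. The following is a minimal sketch, not part of the paper: the dictionary layout and the names `REPORTED`, `ratio`, and `energy_per_op_fj` are illustrative assumptions, and only the four reported numbers come from the abstract.

```python
# Figures quoted in the abstract (1x1b MAC operations, DenseNet-40).
# The data layout and helper names below are illustrative assumptions.
REPORTED = {
    "0.7V": {"tops_per_w": 59.1, "gops_per_mm2": 70.9},
    "1.0V": {"tops_per_w": 43.0, "gops_per_mm2": 112.5},
}


def ratio(metric: str, numerator_vdd: str, denominator_vdd: str) -> float:
    """Ratio of one reported metric between two supply voltages."""
    return REPORTED[numerator_vdd][metric] / REPORTED[denominator_vdd][metric]


def energy_per_op_fj(tops_per_w: float) -> float:
    """Convert TOPS/W to energy per operation in femtojoules.

    1 TOPS/W = 1e12 ops per joule, so one op costs 1/(TOPS/W) pJ,
    which is 1000/(TOPS/W) fJ.
    """
    return 1000.0 / tops_per_w


# Low-power mode (0.7 V) versus high-performance mode (1.0 V).
efficiency_gain = ratio("tops_per_w", "0.7V", "1.0V")
density_gain = ratio("gops_per_mm2", "1.0V", "0.7V")

print(f"Energy efficiency at 0.7 V vs 1.0 V: {efficiency_gain:.2f}x")
print(f"Compute density at 1.0 V vs 0.7 V:   {density_gain:.2f}x")
for vdd, metrics in REPORTED.items():
    fj = energy_per_op_fj(metrics["tops_per_w"])
    print(f"Energy per 1x1b op at {vdd}: {fj:.1f} fJ")
```

Under these numbers, dropping VDD from 1.0 V to 0.7 V improves energy efficiency by roughly 1.37x, while the 1.0 V mode delivers roughly 1.59x the compute density. These ratios are derived here from the reported values and are not stated in the abstract itself.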

