FPCIM: A Fully-Parallel Robust ReRAM CIM Processor for Edge AI Devices
Yan-Cheng Guo, Wei-Tien Lin, T. Hou, Tian-Sheuan Chang
2023 IEEE International Symposium on Circuits and Systems (ISCAS), 2023-05-21. DOI: 10.1109/ISCAS46773.2023.10181402 (https://doi.org/10.1109/ISCAS46773.2023.10181402)
Abstract
Computing-in-memory (CIM) is popular for deep learning because its massive parallelism and low data movement yield high energy efficiency. However, current ReRAM-based CIM designs use only partial parallelism, since fully parallel CIM can suffer lower model accuracy due to severe nonideal effects. This paper proposes a robust fully-parallel ReRAM-based CIM processor for deep learning. The proposed design exploits the fully-parallel computation of a $1024 \times 1024$ array to achieve 110.59 TOPS, and reduces nonideal effects with in-ReRAM computing (IRC) training and a hybrid digital/IRC design, limiting the accuracy loss to only 1.55%. The design is programmable through a compact CIM-oriented instruction set that supports various 2-D convolutional neural networks (NN) as well as hybrid digital/IRC designs. The final implementation achieves an energy efficiency of 2740.41 TOPS/W at 125 MHz in TSMC 40 nm technology, which is superior to previous designs.
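The throughput and efficiency figures in the abstract can be related by simple arithmetic. The sketch below is a back-of-the-envelope check, not a result from the paper: it assumes that the 110.59 TOPS and 2740.41 TOPS/W figures describe the same operating point at 125 MHz, which the abstract does not state explicitly. The function names are illustrative.

```python
# Figures quoted in the abstract.
THROUGHPUT_TOPS = 110.59          # peak throughput, tera-operations per second
EFFICIENCY_TOPS_PER_W = 2740.41   # energy efficiency
CLOCK_HZ = 125e6                  # operating frequency


def implied_power_watts(tops: float, tops_per_watt: float) -> float:
    """Power implied by throughput / efficiency.

    Assumption: both figures refer to the same operating point.
    """
    return tops / tops_per_watt


def ops_per_cycle(tops: float, clock_hz: float) -> float:
    """Average number of operations completed per clock cycle."""
    return tops * 1e12 / clock_hz


power_w = implied_power_watts(THROUGHPUT_TOPS, EFFICIENCY_TOPS_PER_W)
ops = ops_per_cycle(THROUGHPUT_TOPS, CLOCK_HZ)

print(f"Implied power: {power_w * 1e3:.1f} mW")
print(f"Operations per cycle: {ops:,.0f}")
```

Under that assumption, the implied power is roughly 40 mW. How the paper counts an operation (for example, input and weight bit widths) is not given in the abstract, so the per-cycle figure should be read only as an order-of-magnitude indication.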