Yan-Cheng Guo, Wei-Tien Lin, T. Hou, Tian-Sheuan Chang
2023 IEEE International Symposium on Circuits and Systems (ISCAS), published 2023-05-21. DOI: 10.1109/ISCAS46773.2023.10181402 (https://doi.org/10.1109/ISCAS46773.2023.10181402)
FPCIM: A Fully-Parallel Robust ReRAM CIM Processor for Edge AI Devices
Computing-in-memory (CIM) is popular for deep learning because its massive parallelism and low data movement yield high energy efficiency. However, current ReRAM-based CIM designs use only partial parallelism, since fully parallel CIM can suffer lower model accuracy due to severe nonideal effects. This paper proposes a robust fully-parallel ReRAM-based CIM processor for deep learning. The proposed design exploits the fully-parallel computation of a $1024 \times 1024$ array to achieve 110.59 TOPS, and it reduces nonideal effects with in-ReRAM computing (IRC) training and a hybrid digital/IRC design, limiting the accuracy loss to only 1.55%. The design is programmable through a compact CIM-oriented instruction set that supports various 2-D convolutional neural networks (NN) as well as hybrid digital/IRC designs. The final implementation achieves an energy efficiency of 2740.41 TOPS/W at 125 MHz in TSMC 40 nm technology, which is superior to previous designs.
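
As a back-of-envelope reading of the headline figures, the short Python sketch below derives two quantities the abstract does not state: the implied power draw and the implied number of operations per clock cycle. It assumes that the 110.59 TOPS throughput and the 2740.41 TOPS/W efficiency refer to the same operating point at 125 MHz; the abstract does not confirm this, and it does not say how an "operation" is counted. The function names are illustrative only.

```python
# Figures quoted in the abstract.
THROUGHPUT_TOPS = 110.59          # tera-operations per second
EFFICIENCY_TOPS_PER_W = 2740.41   # tera-operations per second per watt
CLOCK_HZ = 125e6                  # 125 MHz


def implied_power_watts(tops: float, tops_per_watt: float) -> float:
    """Power implied by throughput / efficiency.

    Assumption: both figures describe the same operating point.
    """
    return tops / tops_per_watt


def implied_ops_per_cycle(tops: float, clock_hz: float) -> float:
    """Operations completed per clock cycle at the given clock rate."""
    return tops * 1e12 / clock_hz


power_w = implied_power_watts(THROUGHPUT_TOPS, EFFICIENCY_TOPS_PER_W)
ops_cycle = implied_ops_per_cycle(THROUGHPUT_TOPS, CLOCK_HZ)

print(f"implied power: {power_w * 1e3:.2f} mW")
print(f"implied operations per cycle: {ops_cycle:,.0f}")
```

Under these assumptions the implied power is roughly 40 mW, and the design completes on the order of 885 thousand operations per cycle. These are derived estimates, not values reported by the authors.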