Machine Learning Algorithm Co-Design for a 40 nm RRAM Analog Compute-in-Memory Accelerator

Ethan G. Weinstock, Yiming Tan, Wantong Li, Shimeng Yu

2023 IEEE International Opportunity Research Scholars Symposium (ORSS)
Published: 2023-04-23
DOI: 10.1109/ORSS58323.2023.10161812
Citations: 0
Abstract
Resistive Random-Access Memory (RRAM)-based analog compute-in-memory (CIM) chips can accelerate deep neural network (DNN) inference on low-power, resource-constrained edge devices. In this work, we present software-algorithm co-design techniques that optimize DNN inference on RRAM-based CIM accelerators for a handwritten digit recognition workload. Our approach involves three key steps: 1) an integer-based training framework that optimizes raw DNN output for low bit-width networks; 2) a row-rename step that leverages the low weight variance of low-precision networks to reduce network storage requirements; and 3) a row-reorder step that leverages the spatial nature of sparsity-aware input control to reduce the effect of analog noise on error rate. We apply these techniques to train a LeNet-5 model with 1-bit weights and 2-bit activations to 96.66% accuracy on the MNIST dataset, reduce the area of the final fully connected layer by 45.24%, and reduce the impact of analog noise in the final layer on inference accuracy by 5.6%.
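The row-rename idea described above can be illustrated with a minimal sketch (the matrix shape, values, and variable names are illustrative assumptions, not the paper's implementation): with 1-bit weights, many rows of a layer's weight matrix coincide, so duplicate rows can be stored once in the crossbar while an index table maps each logical row to its unique physical copy.

```python
import numpy as np

# Hypothetical 1-bit weight matrix for a fully connected layer;
# each row would map to one row of the RRAM crossbar.
rng = np.random.default_rng(0)
W = rng.integers(0, 2, size=(64, 6))  # 64 logical rows, values in {0, 1}

# "Row rename" sketch: collapse identical rows into a unique set and
# keep an index table from logical rows to their stored copies.
unique_rows, row_index = np.unique(W, axis=0, return_inverse=True)

# Physical storage drops from W.shape[0] rows to unique_rows.shape[0]
# rows; the index table recovers the original matrix losslessly.
reconstructed = unique_rows[row_index]
assert np.array_equal(reconstructed, W)
```

With only 2^6 = 64 possible binary rows of width 6, collisions are frequent at low precision, which is the effect the low weight variance of quantized networks exploits to shrink the stored array.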