Machine Learning Algorithm Co-Design for a 40 nm RRAM Analog Compute-in-Memory Accelerator

Ethan G. Weinstock, Yiming Tan, Wantong Li, Shimeng Yu
Published in: 2023 IEEE International Opportunity Research Scholars Symposium (ORSS), 2023-04-23
DOI: 10.1109/ORSS58323.2023.10161812 (https://doi.org/10.1109/ORSS58323.2023.10161812)

Abstract

Resistive random-access memory (RRAM)-based analog compute-in-memory (CIM) chips can accelerate deep neural network (DNN) inference in low-power, resource-constrained edge devices. In this work, we present software-algorithm co-design techniques that optimize DNN inference on RRAM-based CIM accelerators for a handwritten digit recognition workload. Our approach involves three key steps: 1) an integer-based training framework that optimizes raw DNN output for low bit-width networks; 2) a row rename step that leverages low weight variance in low-precision networks to reduce network storage requirements; and 3) a row reorder step that leverages the spatial nature of sparsity-aware input control to reduce the effect of analog noise on error rate. Applying these techniques, we train a LeNet-5 model with 1-bit weights and 2-bit activations to 96.66% accuracy on the MNIST dataset, reduce the area of the final fully connected layer by 45.24%, and reduce the impact of analog noise in the final layer on inference accuracy by 5.6%.
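
The abstract describes the row rename step only at a high level. The sketch below shows one possible reading of it, not the authors' implementation: it assumes that "low weight variance" means many rows of a 1-bit weight matrix are identical, so duplicate rows can be stored once and the inputs that share a row summed before the multiply. The function names, the matrix shape, and the use of NumPy are illustrative assumptions, and the sketch ignores how summed inputs would be applied to a physical RRAM array.

```python
import numpy as np

def merge_duplicate_rows(weights):
    """Collapse identical rows of a weight matrix (hypothetical helper).

    Returns (unique_rows, row_map), where row_map[i] is the index in
    unique_rows that stores original row i.
    """
    unique_rows, row_map = np.unique(weights, axis=0, return_inverse=True)
    # Flatten the inverse index so it is 1-D regardless of NumPy version.
    return unique_rows, row_map.reshape(-1)

def matvec_with_merged_rows(unique_rows, row_map, activations):
    """Compute activations @ weights using only the stored unique rows."""
    # Sum the activations that share a stored row.
    pooled = np.zeros(unique_rows.shape[0], dtype=np.int64)
    np.add.at(pooled, row_map, activations)
    # One smaller product replaces the original full-size product.
    return pooled @ unique_rows

rng = np.random.default_rng(0)
# Assumed toy layer: 84 inputs, 3 outputs, 1-bit weights in {-1, +1}.
# With 3 columns there are at most 2**3 = 8 distinct rows.
weights = rng.choice([-1, 1], size=(84, 3))
# 2-bit activations take values in {0, 1, 2, 3}.
activations = rng.integers(0, 4, size=84)

unique_rows, row_map = merge_duplicate_rows(weights)
reference = activations @ weights
merged = matvec_with_merged_rows(unique_rows, row_map, activations)

print("stored rows:", unique_rows.shape[0], "of", weights.shape[0])
print("outputs match:", np.array_equal(reference, merged))
```

Under these assumptions the merged product equals the original one exactly, because each output is a sum over inputs and inputs that multiply the same row can be grouped first. The storage saving depends entirely on how many rows repeat, which this toy example exaggerates by using only three output columns.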