Work-in-Progress: Toward Energy-efficient Near STT-MRAM Processing Architecture for Neural Networks

Yueting Li, B. Zhao, Xinyi Xu, Yundong Zhang, Jun Wang, Weisheng Zhao
{"title":"研究进展:面向神经网络节能的近STT-MRAM处理架构","authors":"Yueting Li, B. Zhao, Xinyi Xu, Yundong Zhang, Jun Wang, Weisheng Zhao","doi":"10.1109/CODES-ISSS55005.2022.00013","DOIUrl":null,"url":null,"abstract":"The size of parameters in artificial neural network (NN) applications grows quickly from a handful to the GB-level. The data transmission poses a key challenge for NN, and either neuron is removed or data compression reduces pressure on memory access but cannot successfully decrease data traffic. Therefore, we propose the near spin-transfer-torque magnetic random processing architecture for developing energy-efficient NNs. Our approach provides system architects with a preliminary scheme to obtain real-time transmission that near memory controller directly compresses non-zero elements, and encodes the corresponding index depending on the kernel size. Furthermore, it adjusts the number of multiplication accumulators and avoids unnecessary hardware overheads during computation. The preliminary experimental results demonstrated this design verified with weights that currently achieve up to 3.05x speedup and 29.6% power compared with the unoptimized one.","PeriodicalId":129167,"journal":{"name":"2022 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Work-in-Progress: Toward Energy-efficient Near STT-MRAM Processing Architecture for Neural Networks\",\"authors\":\"Yueting Li, B. Zhao, Xinyi Xu, Yundong Zhang, Jun Wang, Weisheng Zhao\",\"doi\":\"10.1109/CODES-ISSS55005.2022.00013\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The size of parameters in artificial neural network (NN) applications grows quickly from a handful to the GB-level. 
The data transmission poses a key challenge for NN, and either neuron is removed or data compression reduces pressure on memory access but cannot successfully decrease data traffic. Therefore, we propose the near spin-transfer-torque magnetic random processing architecture for developing energy-efficient NNs. Our approach provides system architects with a preliminary scheme to obtain real-time transmission that near memory controller directly compresses non-zero elements, and encodes the corresponding index depending on the kernel size. Furthermore, it adjusts the number of multiplication accumulators and avoids unnecessary hardware overheads during computation. The preliminary experimental results demonstrated this design verified with weights that currently achieve up to 3.05x speedup and 29.6% power compared with the unoptimized one.\",\"PeriodicalId\":129167,\"journal\":{\"name\":\"2022 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)\",\"volume\":\"36 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CODES-ISSS55005.2022.00013\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 International Conference on Hardware/Software Codesign and System Synthesis 
(CODES+ISSS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CODES-ISSS55005.2022.00013","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0

Abstract

The number of parameters in artificial neural network (NN) applications has grown rapidly from a handful to the gigabyte level. Data transmission therefore poses a key challenge for NNs: neuron pruning and data compression reduce the pressure on memory access, but neither sufficiently decreases data traffic. We therefore propose a near spin-transfer-torque magnetic RAM (STT-MRAM) processing architecture for developing energy-efficient NNs. Our approach gives system architects a preliminary scheme for real-time transmission, in which a near-memory controller directly compresses the non-zero elements and encodes the corresponding indices according to the kernel size. Furthermore, it adjusts the number of multiply-accumulate units, avoiding unnecessary hardware overhead during computation. Preliminary experimental results, verified with weights, show that the design currently achieves up to a 3.05x speedup at 29.6% of the power of the unoptimized design.
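The compression step the abstract describes (keeping only non-zero weights and encoding their positions with an index width derived from the kernel size) can be sketched as follows. This is an illustrative software model only; the function names, storage layout, and bit-width rule are assumptions for exposition, not the paper's actual hardware format.

```python
import math

def compress_kernel(kernel_flat):
    """Keep only the non-zero weights of a flattened kernel, plus
    their positions. The index bit-width is derived from the kernel
    size, mirroring the kernel-size-dependent index encoding the
    abstract mentions. (Hypothetical sketch, not the paper's format.)"""
    index_bits = max(1, math.ceil(math.log2(len(kernel_flat))))
    values = [w for w in kernel_flat if w != 0]
    indices = [i for i, w in enumerate(kernel_flat) if w != 0]
    return values, indices, index_bits

def decompress_kernel(values, indices, size):
    """Rebuild the dense kernel from the compressed representation."""
    kernel = [0] * size
    for v, i in zip(values, indices):
        kernel[i] = v
    return kernel

# Example: a 3x3 kernel with four non-zero weights; ceil(log2(9)) = 4,
# so each position index fits in 4 bits.
vals, idxs, bits = compress_kernel([0, 2, 0, 0, 5, 0, 1, 0, 3])
```

Only the value list and the narrow index list would cross the memory interface, which is how a scheme like this cuts data traffic beyond what pruning alone achieves.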