Design of High Performance RNN Accelerator Based on Network Compression

Wentao Zhu, Yuhao Sun, Zeyu Shen, Haichuan Yang, Yu Gong, Bo Liu
{"title":"基于网络压缩的高性能RNN加速器设计","authors":"Wentao Zhu, Yuhao Sun, Zeyu Shen, Haichuan Yang, Yu Gong, Bo Liu","doi":"10.1109/ICCS51219.2020.9336599","DOIUrl":null,"url":null,"abstract":"As the size of the neural networks expands, the computation and storage consumption of recurrent neural networks is increasing. To solve this problem, this paper proposes a recurrent neural network accelerator which can reduce computation redundancy, memory overhead and energy consumption. A novel network compression method based on pruning and hybgrid quantization is also proposed to reduce computation and memory overhead. Based on the designs above, a precision adaptive approximate calculation based accelerator is designed to achieve high energy efficiency. The experimental results show that under the TSMC 28nm process, when the data bit width is 4bit and the working voltage is 0.8V, the peak performance of proposed accelerator is not reduced, the power consumption is 38.4mW, and the energy efficiency is 2.7TOPs/W. The energy efficiency of proposed accelerator is 2.5 times more than that of the state-of-the-art recurrent neural network accelerators.","PeriodicalId":193552,"journal":{"name":"2020 IEEE 2nd International Conference on Circuits and Systems (ICCS)","volume":"189 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Design of High Performance RNN Accelerator Based on Network Compression\",\"authors\":\"Wentao Zhu, Yuhao Sun, Zeyu Shen, Haichuan Yang, Yu Gong, Bo Liu\",\"doi\":\"10.1109/ICCS51219.2020.9336599\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"As the size of the neural networks expands, the computation and storage consumption of recurrent neural networks is increasing. To solve this problem, this paper proposes a recurrent neural network accelerator which can reduce computation redundancy, memory overhead and energy consumption. A novel network compression method based on pruning and hybgrid quantization is also proposed to reduce computation and memory overhead. Based on the designs above, a precision adaptive approximate calculation based accelerator is designed to achieve high energy efficiency. The experimental results show that under the TSMC 28nm process, when the data bit width is 4bit and the working voltage is 0.8V, the peak performance of proposed accelerator is not reduced, the power consumption is 38.4mW, and the energy efficiency is 2.7TOPs/W. 
The energy efficiency of proposed accelerator is 2.5 times more than that of the state-of-the-art recurrent neural network accelerators.\",\"PeriodicalId\":193552,\"journal\":{\"name\":\"2020 IEEE 2nd International Conference on Circuits and Systems (ICCS)\",\"volume\":\"189 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-12-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 IEEE 2nd International Conference on Circuits and Systems (ICCS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICCS51219.2020.9336599\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE 2nd International Conference on Circuits and Systems (ICCS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCS51219.2020.9336599","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0

Abstract

As neural networks grow in size, the computation and storage demands of recurrent neural networks keep increasing. To address this problem, this paper proposes a recurrent neural network accelerator that reduces computation redundancy, memory overhead, and energy consumption. A novel network compression method based on pruning and hybrid quantization is also proposed to cut computation and memory overhead. On top of these designs, an accelerator based on precision-adaptive approximate calculation is designed to achieve high energy efficiency. Experimental results show that, in a TSMC 28 nm process with a 4-bit data width and a 0.8 V supply voltage, the proposed accelerator delivers its peak performance without degradation, consumes 38.4 mW, and achieves an energy efficiency of 2.7 TOPS/W, 2.5x that of state-of-the-art recurrent neural network accelerators.
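The abstract names two compression steps, pruning and low-bit quantization, but gives no implementation details. Below is a minimal NumPy sketch of what those two steps look like in general; the magnitude-based pruning criterion, the per-tensor symmetric scaling, and the toy 128x128 recurrent weight matrix are all illustrative assumptions, not the authors' method (their "hybrid quantization" and precision-adaptive approximate calculation are not specified on this page).

```python
# Illustrative sketch: magnitude pruning followed by signed 4-bit
# uniform quantization of one RNN weight matrix. Assumptions, not
# the paper's actual compression pipeline.
import numpy as np

def prune_by_magnitude(w: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude weights until `sparsity` is reached."""
    threshold = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) >= threshold, w, 0.0)

def quantize_uniform(w: np.ndarray, bits: int = 4):
    """Symmetric uniform quantization onto a signed `bits`-bit integer grid."""
    qmax = 2 ** (bits - 1) - 1                       # 7 for 4-bit signed
    scale = max(float(np.max(np.abs(w))) / qmax, 1e-12)
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
w_hh = rng.normal(scale=0.1, size=(128, 128))        # toy recurrent weight matrix
w_sparse = prune_by_magnitude(w_hh, sparsity=0.8)    # keep ~20% of the weights
w_q, scale = quantize_uniform(w_sparse, bits=4)      # 4-bit grid, as in the abstract
w_hat = w_q.astype(np.float32) * scale               # dequantized view of the weights
print(f"nonzero weights kept: {np.count_nonzero(w_q)}/{w_q.size}")
print(f"max abs quantization error: {np.max(np.abs(w_sparse - w_hat)):.4f}")
```

In a sketch like this, pruning shrinks the number of multiply-accumulates the datapath must perform, while the 4-bit grid shrinks weight storage and MAC width, which is consistent with the 4-bit data width and energy-efficiency figures the abstract reports.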