FLoPAD-GRU: A Flexible, Low Power, Accelerated DSP for Gated Recurrent Unit Neural Network

Ilayda Yaman, Allan Andersen, Lucas Ferreira, Joachim Rodrigues
DOI: 10.1109/SBCCI53441.2021.9529981
Published in: 2021 34th SBC/SBMicro/IEEE/ACM Symposium on Integrated Circuits and Systems Design (SBCCI)
Publication date: 2021-08-23
Citations: 0

Abstract

Recurrent neural networks (RNNs) are efficient for classification of sequential data such as speech and audio due to their high precision on such tasks. However, their limited power efficiency, together with the memory capacity and bandwidth they require, makes them less suitable for battery-powered devices. In this work, we introduce FLoPAD-GRU, a system on a chip (SoC) for efficient processing of gated recurrent unit (GRU) networks. It consists of a digital signal processor (DSP) supplemented with an optimized hardware accelerator, which reduces memory accesses and cost. The system is programmable and scalable, which allows networks of different sizes to be executed. Synthesized in 28 nm CMOS technology, the design achieves real-time classification at 4 MHz with an energy dissipation of 4.1 pJ/classification, a 15× improvement over a pure DSP realization. The memory requirements are reduced by 75%, which results in a silicon area of 0.7 mm² for the entire SoC.
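For readers unfamiliar with the workload, the following is a minimal sketch of one GRU time step in plain Python, using one common textbook convention for the state update. It is not the FLoPAD-GRU implementation; the abstract does not describe the network topology or data format, so the function names, the parameter dictionary keys (`Wz`, `Uz`, `bz`, and so on) and the sizes used below are assumptions made only for illustration. The weight count at the end hints at why memory capacity and bandwidth dominate on small devices.

```python
import math


def sigmoid(x):
    """Logistic function used by the update and reset gates."""
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def gru_step(x, h, p):
    """One GRU time step.

    x: input vector, h: previous hidden state, p: dict of weights.
    Convention assumed here: h_new = (1 - z) * h + z * h_cand.
    Some references swap the roles of z and (1 - z).
    """
    def gate(W, U, b, h_in, act):
        wx = matvec(W, x)
        uh = matvec(U, h_in)
        return [act(a + c + d) for a, c, d in zip(wx, uh, b)]

    z = gate(p["Wz"], p["Uz"], p["bz"], h, sigmoid)    # update gate
    r = gate(p["Wr"], p["Ur"], p["br"], h, sigmoid)    # reset gate
    rh = [ri * hi for ri, hi in zip(r, h)]             # reset-scaled state
    h_cand = gate(p["Wh"], p["Uh"], p["bh"], rh, math.tanh)
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, h_cand)]


def gru_param_count(input_size, hidden_size):
    """Weights and biases in one GRU layer: three gates, each with W, U and b."""
    per_gate = hidden_size * input_size + hidden_size * hidden_size + hidden_size
    return 3 * per_gate
```

Every time step reads all three weight sets, so the number returned by `gru_param_count` is a rough proxy for both the storage and the memory traffic that a processor must sustain per input sample; the layer sizes passed to it are hypothetical and not taken from the paper.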