SNNAP: Approximate computing on programmable SoCs via neural acceleration

T. Moreau, Mark Wyse, J. Nelson, Adrian Sampson, H. Esmaeilzadeh, L. Ceze, M. Oskin
Published in: 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA), pp. 603-614
Publication date: 2015-03-09
DOI: 10.1109/HPCA.2015.7056066 (https://doi.org/10.1109/HPCA.2015.7056066)
Citations: 136

Abstract

Many applications that can take advantage of accelerators are amenable to approximate execution. Past work has shown that neural acceleration is a viable way to accelerate approximate code. In light of the growing availability of on-chip field-programmable gate arrays (FPGAs), this paper explores neural acceleration on off-the-shelf programmable SoCs. We describe the design and implementation of SNNAP, a flexible FPGA-based neural accelerator for approximate programs. SNNAP is designed to work with a compiler workflow that configures the neural network's topology and weights instead of the programmable logic of the FPGA itself. This approach enables effective use of neural acceleration in commercially available devices and accelerates different applications without costly FPGA reconfigurations. No hardware expertise is required to accelerate software with SNNAP, so the effort required can be substantially lower than custom hardware design for an FPGA fabric and possibly even lower than current "C-to-gates" high-level synthesis (HLS) tools. Our measurements on a Xilinx Zynq FPGA show that SNNAP yields a geometric mean of 3.8× speedup (as high as 38.1×) and 2.8× energy savings (as high as 28×) with less than 10% quality loss across all applications but one. We also compare SNNAP with designs generated by commercial HLS tools and show that SNNAP has similar performance overall, with better resource-normalized throughput on 4 out of 7 benchmarks.
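
To make the "topology and weights instead of programmable logic" idea concrete, here is a minimal software sketch, not SNNAP's actual implementation. It assumes a plain multilayer perceptron with sigmoid activations. The names `mlp_forward`, `config_a`, and `config_b`, and all weight values, are hypothetical and chosen only for illustration. The point it shows is that one fixed evaluator can serve different approximated functions when only the configuration data changes.

```python
import math


def sigmoid(x):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-x))


def mlp_forward(layers, inputs):
    """Evaluate a multilayer perceptron described purely by data.

    layers: list of (weights, biases) pairs, one per layer, where
            weights[j][i] is the weight from input i to neuron j.
    inputs: list of floats fed to the first layer.
    """
    activations = list(inputs)
    for weights, biases in layers:
        activations = [
            sigmoid(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations


# Hypothetical configuration A: a 2 -> 2 -> 1 topology.
config_a = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0]),  # hidden layer, 2 neurons
    ([[1.0, -1.0]], [0.0]),                    # output layer, 1 neuron
]

# Hypothetical configuration B: a 3 -> 1 topology with all-zero parameters.
config_b = [
    ([[0.0, 0.0, 0.0]], [0.0]),
]

# The same evaluator runs both configurations; only the data differs.
out_a = mlp_forward(config_a, [0.2, 0.8])
out_b = mlp_forward(config_b, [1.0, 2.0, 3.0])
```

In this sketch, switching from `config_a` to `config_b` plays the role of reconfiguring the network for a different application while the evaluator itself stays unchanged. This is loosely analogous to leaving the FPGA logic fixed.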