FPGA Design of High-Speed Convolutional Neural Network Hardware Accelerator

A. A. A. El-Maksoud, Abdallah K. Mohamed, A. Tarek, Amr Adel, A. Eid, Farida Khaled, Fatma Khaled, Ziad Ibrahim, Eman El Mandouh, H. Mostafa
{"title":"高速卷积神经网络硬件加速器的FPGA设计","authors":"A. A. A. El-Maksoud, Abdallah K. Mohamed, A. Tarek, Amr Adel, A. Eid, Farida Khaled, Fatma Khaled, Ziad Ibrahim, Eman El Mandouh, H. Mostafa","doi":"10.1109/NILES53778.2021.9600555","DOIUrl":null,"url":null,"abstract":"Convolutional Neural Networks get increasingly importance nowadays as they enable machines to interact with the surrounding environment, which paves the way for computer vision applications. FPGA implementations of CNN architectures have higher speed and lower power consumption compared to GPUs and CPUs. This paper proposes a high-speed hardware accelerator on FPGA for SqueezeNet CNN to accelerate its processing without decreasing the classification accuracy. Several ideas are applied to solve the memory bottleneck issue such as using Ping-Pong memory and deploying several FIFOs in the design. The architecture is built as a pipelined unit to process SqueezeNet CNN layer by layer. Different parallelism techniques are applied while processing the convolution layers to speedup layers processing. Moreover, the proposed accelerator classifies 248.76 fps at a frequency of 100MHz, and 427.4 fps at a frequency of 172 MHz. The proposed accelerator is implemented on Virtex-7 FPGA, and overcomes Geforce RTX 2080Ti GPU and several SqueezeNet FPGA implementations.","PeriodicalId":249153,"journal":{"name":"2021 3rd Novel Intelligent and Leading Emerging Sciences Conference (NILES)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"FPGA Design of High-Speed Convolutional Neural Network Hardware Accelerator\",\"authors\":\"A. A. A. El-Maksoud, Abdallah K. Mohamed, A. Tarek, Amr Adel, A. Eid, Farida Khaled, Fatma Khaled, Ziad Ibrahim, Eman El Mandouh, H. Mostafa\",\"doi\":\"10.1109/NILES53778.2021.9600555\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Convolutional Neural Networks get increasingly importance nowadays as they enable machines to interact with the surrounding environment, which paves the way for computer vision applications. FPGA implementations of CNN architectures have higher speed and lower power consumption compared to GPUs and CPUs. This paper proposes a high-speed hardware accelerator on FPGA for SqueezeNet CNN to accelerate its processing without decreasing the classification accuracy. Several ideas are applied to solve the memory bottleneck issue such as using Ping-Pong memory and deploying several FIFOs in the design. The architecture is built as a pipelined unit to process SqueezeNet CNN layer by layer. Different parallelism techniques are applied while processing the convolution layers to speedup layers processing. Moreover, the proposed accelerator classifies 248.76 fps at a frequency of 100MHz, and 427.4 fps at a frequency of 172 MHz. 
The proposed accelerator is implemented on Virtex-7 FPGA, and overcomes Geforce RTX 2080Ti GPU and several SqueezeNet FPGA implementations.\",\"PeriodicalId\":249153,\"journal\":{\"name\":\"2021 3rd Novel Intelligent and Leading Emerging Sciences Conference (NILES)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-10-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 3rd Novel Intelligent and Leading Emerging Sciences Conference (NILES)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/NILES53778.2021.9600555\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 3rd Novel Intelligent and Leading Emerging Sciences Conference (NILES)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/NILES53778.2021.9600555","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1

Abstract

Convolutional Neural Networks (CNNs) are becoming increasingly important because they enable machines to interact with their surrounding environment, paving the way for computer vision applications. FPGA implementations of CNN architectures offer higher speed and lower power consumption than GPUs and CPUs. This paper proposes a high-speed FPGA hardware accelerator for the SqueezeNet CNN that speeds up processing without reducing classification accuracy. Several techniques are applied to relieve the memory bottleneck, such as ping-pong memory and several FIFOs deployed in the design. The architecture is built as a pipelined unit that processes SqueezeNet layer by layer, and different parallelism techniques are applied to the convolution layers to speed up their processing. The proposed accelerator classifies 248.76 frames per second (fps) at a clock frequency of 100 MHz and 427.4 fps at 172 MHz. It is implemented on a Virtex-7 FPGA and outperforms a GeForce RTX 2080Ti GPU and several prior SqueezeNet FPGA implementations.
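The memory-bottleneck techniques named in the abstract (ping-pong memory and FIFOs) can be illustrated with a small software analogy. The sketch below is not the paper's RTL; it is a minimal, hypothetical Python model of ping-pong (double) buffering, in which a loader fills one buffer with the next tile of feature-map data while the compute stage consumes the other, so memory transfers overlap with computation. The tile count, buffer size, and workload are illustrative assumptions.

import threading
import queue

NUM_TILES = 8                    # hypothetical number of feature-map tiles
BUF_SIZE = 1024                  # hypothetical buffer size in bytes

free_bufs = queue.Queue()        # buffers ready to be refilled
full_bufs = queue.Queue()        # buffers holding data ready to process

# Exactly two buffers: the "ping" and the "pong".
for _ in range(2):
    free_bufs.put(bytearray(BUF_SIZE))

def loader():
    # Producer: fetch each tile into whichever buffer is currently free.
    for tile in range(NUM_TILES):
        buf = free_bufs.get()
        buf[:] = bytes([tile % 256]) * BUF_SIZE   # stand-in for a DMA burst read
        full_bufs.put((tile, buf))
    full_bufs.put(None)                           # end-of-stream marker

def compute():
    # Consumer: process a filled buffer, then return it to the free pool.
    while True:
        item = full_bufs.get()
        if item is None:
            break
        tile, buf = item
        checksum = sum(buf)                       # stand-in for convolution work
        print(f"tile {tile}: checksum {checksum}")
        free_bufs.put(buf)

t = threading.Thread(target=loader)
t.start()
compute()
t.join()

Because the two queues only ever hold the same two buffers, the loader can run at most one tile ahead of the compute stage, which is the behavior a hardware ping-pong memory provides.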
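The two reported throughput figures are consistent with a fixed number of clock cycles per classified frame; the quick check below is our own back-of-the-envelope calculation, not a result from the paper.

cycles_per_frame = 100e6 / 248.76        # ~402,000 cycles per frame at 100 MHz
predicted_fps = 172e6 / cycles_per_frame # scale the same design to 172 MHz
print(round(cycles_per_frame), round(predicted_fps, 1))   # ~401994, ~427.9 (reported: 427.4 fps)

In other words, the accelerator's frame rate scales essentially linearly with clock frequency, as expected for a fully pipelined design.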