FPGA Design of High-Speed Convolutional Neural Network Hardware Accelerator

A. A. A. El-Maksoud, Abdallah K. Mohamed, A. Tarek, Amr Adel, A. Eid, Farida Khaled, Fatma Khaled, Ziad Ibrahim, Eman El Mandouh, H. Mostafa
{"title":"高速卷积神经网络硬件加速器的FPGA设计","authors":"A. A. A. El-Maksoud, Abdallah K. Mohamed, A. Tarek, Amr Adel, A. Eid, Farida Khaled, Fatma Khaled, Ziad Ibrahim, Eman El Mandouh, H. Mostafa","doi":"10.1109/NILES53778.2021.9600555","DOIUrl":null,"url":null,"abstract":"Convolutional Neural Networks get increasingly importance nowadays as they enable machines to interact with the surrounding environment, which paves the way for computer vision applications. FPGA implementations of CNN architectures have higher speed and lower power consumption compared to GPUs and CPUs. This paper proposes a high-speed hardware accelerator on FPGA for SqueezeNet CNN to accelerate its processing without decreasing the classification accuracy. Several ideas are applied to solve the memory bottleneck issue such as using Ping-Pong memory and deploying several FIFOs in the design. The architecture is built as a pipelined unit to process SqueezeNet CNN layer by layer. Different parallelism techniques are applied while processing the convolution layers to speedup layers processing. Moreover, the proposed accelerator classifies 248.76 fps at a frequency of 100MHz, and 427.4 fps at a frequency of 172 MHz. The proposed accelerator is implemented on Virtex-7 FPGA, and overcomes Geforce RTX 2080Ti GPU and several SqueezeNet FPGA implementations.","PeriodicalId":249153,"journal":{"name":"2021 3rd Novel Intelligent and Leading Emerging Sciences Conference (NILES)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"FPGA Design of High-Speed Convolutional Neural Network Hardware Accelerator\",\"authors\":\"A. A. A. El-Maksoud, Abdallah K. Mohamed, A. Tarek, Amr Adel, A. Eid, Farida Khaled, Fatma Khaled, Ziad Ibrahim, Eman El Mandouh, H. Mostafa\",\"doi\":\"10.1109/NILES53778.2021.9600555\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Convolutional Neural Networks get increasingly importance nowadays as they enable machines to interact with the surrounding environment, which paves the way for computer vision applications. FPGA implementations of CNN architectures have higher speed and lower power consumption compared to GPUs and CPUs. This paper proposes a high-speed hardware accelerator on FPGA for SqueezeNet CNN to accelerate its processing without decreasing the classification accuracy. Several ideas are applied to solve the memory bottleneck issue such as using Ping-Pong memory and deploying several FIFOs in the design. The architecture is built as a pipelined unit to process SqueezeNet CNN layer by layer. Different parallelism techniques are applied while processing the convolution layers to speedup layers processing. Moreover, the proposed accelerator classifies 248.76 fps at a frequency of 100MHz, and 427.4 fps at a frequency of 172 MHz. 
The proposed accelerator is implemented on Virtex-7 FPGA, and overcomes Geforce RTX 2080Ti GPU and several SqueezeNet FPGA implementations.\",\"PeriodicalId\":249153,\"journal\":{\"name\":\"2021 3rd Novel Intelligent and Leading Emerging Sciences Conference (NILES)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-10-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 3rd Novel Intelligent and Leading Emerging Sciences Conference (NILES)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/NILES53778.2021.9600555\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 3rd Novel Intelligent and Leading Emerging Sciences Conference (NILES)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/NILES53778.2021.9600555","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1

Abstract

Convolutional Neural Networks (CNNs) are becoming increasingly important because they enable machines to interact with their surrounding environment, paving the way for computer vision applications. FPGA implementations of CNN architectures offer higher speed and lower power consumption than GPUs and CPUs. This paper proposes a high-speed FPGA hardware accelerator for the SqueezeNet CNN that speeds up processing without reducing classification accuracy. Several techniques are applied to relieve the memory bottleneck, such as ping-pong memory and several FIFOs deployed in the design. The architecture is built as a pipelined unit that processes SqueezeNet layer by layer, and different parallelism techniques are applied to the convolution layers to speed up their processing. The proposed accelerator classifies 248.76 frames per second (fps) at a clock frequency of 100 MHz and 427.4 fps at 172 MHz. It is implemented on a Virtex-7 FPGA and outperforms a GeForce RTX 2080Ti GPU and several prior SqueezeNet FPGA implementations.
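The memory-bottleneck techniques named in the abstract (ping-pong memory and FIFOs) can be illustrated with a small software analogy. The sketch below is not the paper's RTL; it is a minimal, hypothetical Python model of ping-pong (double) buffering, in which a loader fills one buffer with the next tile of feature-map data while the compute stage consumes the other, so memory transfers overlap with computation. The tile count, buffer size, and workload are illustrative assumptions.

import threading
import queue

NUM_TILES = 8                    # hypothetical number of feature-map tiles
BUF_SIZE = 1024                  # hypothetical buffer size in bytes

free_bufs = queue.Queue()        # buffers ready to be refilled
full_bufs = queue.Queue()        # buffers holding data ready to process

# Exactly two buffers: the "ping" and the "pong".
for _ in range(2):
    free_bufs.put(bytearray(BUF_SIZE))

def loader():
    # Producer: fetch each tile into whichever buffer is currently free.
    for tile in range(NUM_TILES):
        buf = free_bufs.get()
        buf[:] = bytes([tile % 256]) * BUF_SIZE   # stand-in for a DMA burst read
        full_bufs.put((tile, buf))
    full_bufs.put(None)                           # end-of-stream marker

def compute():
    # Consumer: process a filled buffer, then return it to the free pool.
    while True:
        item = full_bufs.get()
        if item is None:
            break
        tile, buf = item
        checksum = sum(buf)                       # stand-in for convolution work
        print(f"tile {tile}: checksum {checksum}")
        free_bufs.put(buf)

t = threading.Thread(target=loader)
t.start()
compute()
t.join()

Because the two queues only ever hold the same two buffers, the loader can run at most one tile ahead of the compute stage, which is the behavior a hardware ping-pong memory provides.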
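The two reported throughput figures are consistent with a fixed number of clock cycles per classified frame; the quick check below is our own back-of-the-envelope calculation, not a result from the paper.

cycles_per_frame = 100e6 / 248.76        # ~402,000 cycles per frame at 100 MHz
predicted_fps = 172e6 / cycles_per_frame # scale the same design to 172 MHz
print(round(cycles_per_frame), round(predicted_fps, 1))   # ~401994, ~427.9 (reported: 427.4 fps)

In other words, the accelerator's frame rate scales essentially linearly with clock frequency, as expected for a fully pipelined design.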