Research on FPGA Based Convolutional Neural Network Acceleration Method
Tan Xiao, Man Tao
2021 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA), published 2021-06-28
DOI: 10.1109/ICAICA52286.2021.9498022 (https://doi.org/10.1109/ICAICA52286.2021.9498022)
Citations: 3
Abstract
In recent years, continued algorithmic breakthroughs have steadily increased the computational complexity of target detection algorithms. In the forward inference stage, many practical applications must meet low-latency requirements under strict power constraints, so building a low-power, low-cost, high-performance target detection platform has attracted growing attention. To meet the high-performance, low-power requirements of mobile scenarios, a hardware acceleration architecture suitable for different CNNs is designed by combining the working principles of CNNs with the computing characteristics of FPGAs. The basic CNN operation units, including the convolution unit, the pooling unit, and the activation function unit, are implemented through high-level synthesis. Optimization strategies such as pipelining, dynamic fixed-point quantization, and ping-pong caching are adopted to reduce on-chip and off-chip memory accesses and the use of storage resources. Finally, two convolutional neural networks with different structures, the LeNet-5 classification network and the YOLOv2 detection network, are selected for functional verification and performance analysis. The experimental results show that the FPGA-based CNN accelerator designed in this paper provides better performance with fewer resources and lower power consumption, and uses the FPGA's hardware resources efficiently.
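The abstract names dynamic fixed-point quantization without giving details, so the following is only a minimal sketch of the general idea, not the paper's implementation. It assumes a signed word whose fractional length is chosen per tensor from that tensor's largest magnitude; the function names (`choose_frac_bits`, `quantize`, `dequantize`) and the 8-bit word width in the usage lines are hypothetical choices made for illustration.

```python
import math


def choose_frac_bits(values, total_bits=16):
    """Pick a fractional length so the largest magnitude fits a signed word.

    Assumption: one fractional length is shared by a whole tensor
    (for example, the weights of one layer).
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return total_bits - 1
    # Integer bits needed for the magnitude; the sign bit is counted separately.
    int_bits = math.floor(math.log2(max_abs)) + 1
    return total_bits - 1 - int_bits


def quantize(values, frac_bits, total_bits=16):
    """Scale by 2**frac_bits, round, and saturate to the signed range."""
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    scale = 2.0 ** frac_bits
    return [min(hi, max(lo, round(v * scale))) for v in values]


def dequantize(codes, frac_bits):
    """Map integer codes back to real values."""
    scale = 2.0 ** frac_bits
    return [c / scale for c in codes]


# Hypothetical data: small weights get more fractional bits than
# larger activations, which is the "dynamic" part of the scheme.
weights = [0.5, -0.25, 0.75]
activations = [3.5, -1.25]
w_frac = choose_frac_bits(weights, total_bits=8)
a_frac = choose_frac_bits(activations, total_bits=8)
w_codes = quantize(weights, w_frac, total_bits=8)
a_codes = quantize(activations, a_frac, total_bits=8)
```

Because the scale is a power of two, rescaling between layers reduces to bit shifts, which is one reason this kind of scheme tends to suit FPGA arithmetic.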