Title: FPGA Implementation of Object Detection Accelerator Based on Vitis-AI
Authors: Jin Wang, Shenshen Gu
Published in: 2021 11th International Conference on Information Science and Technology (ICIST)
Publication date: 2021-05-21
DOI: 10.1109/ICIST52614.2021.9440554 (https://doi.org/10.1109/ICIST52614.2021.9440554)
Citations: 21
Abstract
The emergence of YOLOv3 makes it possible to detect small targets. Due to the characteristics of the YOLO network itself, the YOLOv3 network has exceptionally high requirements for computing power and memory bandwidth, so it usually needs to be deployed on a dedicated hardware acceleration platform. An FPGA is a logically reconfigurable hardware chip with substantial advantages in performance and power consumption, which makes it a good choice for deploying a deep convolutional network. In this paper, we propose a reconfigurable YOLOv3 FPGA hardware accelerator based on an AXI-bus ARM+FPGA architecture. The YOLOv3 network is quantized with Vitis AI, and a series of operations such as model compression and data pre-processing reduce the accelerator's on-chip resource usage and the access time to external storage. Pipelined operation enables the FPGA to achieve higher throughput. Compared with a GPU implementation of the YOLOv3 model, the FPGA-based YOLOv3 accelerator has lower energy consumption and can achieve higher throughput.
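
The abstract does not describe the quantization step in detail. As a rough illustration of why quantization reduces memory traffic, the sketch below applies generic symmetric int8 quantization to a few weights. This is not the Vitis AI quantizer or its API; the function names `quantize_int8` and `dequantize` and the sample weights are hypothetical and chosen only for illustration.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to int8 codes (illustrative only)."""
    max_abs = max(abs(v) for v in values)
    # Map the largest magnitude to 127; guard against an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the int8 range.
    codes = [max(-128, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from int8 codes."""
    return [c * scale for c in codes]


weights = [0.5, -1.27, 0.031, 1.0]  # hypothetical float weights, not from the paper
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
```

Each int8 code occupies one byte instead of the four bytes of a 32-bit float, so the same weights need roughly a quarter of the storage and bandwidth, at the cost of a rounding error of at most half the scale per value.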