FPGA Hardware Implementation of the Yolo Subclass Convolutional Neural Network Model in Computer Vision Systems
Authors: N. G. Markov, I. V. Zoev, E. Mytsko
Published in: 2022 International Siberian Conference on Control and Communications (SIBCON), 2022-11-17
DOI: 10.1109/SIBCON56144.2022.10003015 (https://doi.org/10.1109/SIBCON56144.2022.10003015)
Citations: 0
Abstract
A computing unit (CU) for a computer vision system (CVS) has been developed on a system on a chip (SoC) with a Xilinx field-programmable gate array (FPGA). A new convolutional neural network (CNN) model of the YOLO subclass, tiny-YOLO-Inception-ResNet2, is proposed; its distinguishing feature is the presence of two Inception-ResNet modules. The hardware implementation uses the weight coefficients of a trained software implementation of the same model. Images from the Pascal VOC 2007 dataset were used to train and test these CNN models. The effectiveness of the CU was studied in two ways. First, the time spent processing one image was measured for each layer of the CNN model and for the whole model, as a function of the number of universal computing units in the CU. Second, object detection accuracy was studied as a function of the bit width (16 or 32 bits) of the floating-point numbers. The study concludes that the calculations must be performed with 32-bit floating-point numbers. The power consumption of the CU did not exceed 12 W in any experiment.
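The abstract reports that 16-bit floating-point arithmetic was not sufficient but does not say why. One plausible contributing factor is rounding error that builds up in long multiply-accumulate chains. The sketch below illustrates that effect only; it is not the authors' implementation and does not reproduce their measurements. The helper names `round_to` and `dot`, the random inputs, and the 3x3x256 accumulation length are all assumptions chosen for illustration.

```python
import random
import struct


def round_to(value, fmt):
    """Round a Python float to an IEEE 754 format via struct.

    fmt is "e" for 16-bit (half) or "f" for 32-bit (single) precision.
    """
    return struct.unpack("<" + fmt, struct.pack("<" + fmt, value))[0]


def dot(xs, ws, fmt):
    """Emulated low-precision dot product.

    Inputs, each product, and the running sum are rounded to fmt,
    mimicking a multiply-accumulate unit that works at that width.
    """
    acc = 0.0
    for x, w in zip(xs, ws):
        prod = round_to(round_to(x, fmt) * round_to(w, fmt), fmt)
        acc = round_to(acc + prod, fmt)
    return acc


random.seed(0)
# Hypothetical size: one output value of a 3x3 convolution over 256 channels.
n = 3 * 3 * 256
xs = [random.uniform(0.0, 1.0) for _ in range(n)]    # stand-in activations
ws = [random.uniform(-0.1, 0.1) for _ in range(n)]   # stand-in weights

# Double-precision result used as the reference value.
exact = sum(x * w for x, w in zip(xs, ws))
err16 = abs(dot(xs, ws, "e") - exact)
err32 = abs(dot(xs, ws, "f") - exact)
print(f"16-bit error: {err16:.3e}, 32-bit error: {err32:.3e}")
```

With these made-up inputs the 16-bit accumulation drifts much further from the reference than the 32-bit one. How much such drift affects detection accuracy depends on the network and data, which is what the paper's experiments measure.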