Real-Time Fixed-Point Hardware Accelerator of Convolutional Neural Network on FPGA Based
Bahadır Özkilbaç, I. Ozbek, T. Karacali
2022 5th International Conference on Computing and Informatics (ICCI), published 2022-03-09
DOI: 10.1109/icci54321.2022.9756093
Citations: 2
Abstract
Convolutional neural networks (CNNs), which can automatically detect the important features of input data without human intervention, are widely used in applications such as face recognition, speech recognition, image classification, and object detection. In real-time CNN applications, computation speed is as important as accuracy. However, in some applications with high computational complexity, available systems cannot meet the demand for high-speed performance at low power consumption. In this study, a CNN accelerator designed in FPGA hardware is presented to meet this speed demand. In the design, the CNN is treated as a streaming-interface application, which reduces both the amount of temporary storage and memory latency. Each layer is designed with maximum parallelism, exploiting the FPGA fabric. Fixed-point number representation is preferred because of its low latency, at a negligible sacrifice in accuracy. As a result, the forward propagation of a CNN can be executed at high speed in the FPGA. To compare real-time performance, a digit-classification application is executed both on the hardware designed in the FPGA and on the ARM processor on the same chip. The real-time results show that the application running on the FPGA hardware is 30x faster than on the ARM processor.