CNN-based End-to-end Autonomous Driving on FPGA Using TVM and VTA
Toshihiro Uetsuki, Y. Okuyama, Jungpil Shin
2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC), December 2021
DOI: 10.1109/MCSoC51149.2021.00028
Citations: 2
Abstract
This paper presents a method for reducing inference time while maintaining inference accuracy in autonomous driving, using TVM and the Versatile Tensor Accelerator (VTA) on a Field-Programmable Gate Array (FPGA). We focus on end-to-end deep neural networks (DNNs) that compute a car's throttle and steering values directly from camera images to realize autonomous driving. Such a network is highly accurate in that it does not rely on any hand-crafted features. However, real-time execution of autonomous driving DNNs on embedded systems is difficult because of their limited computational resources and electric power. To address this problem, we implemented the network on an FPGA using TVM and VTA. Using TVM, we modified the network to (1) reduce the precision of the neural network parameters from float32 to int8, (2) schedule the matrix computations in hardware, and (3) tune the operators, tensors, and hardware parameters to maximize the network's runtime performance. We measured the inference time and accuracy of CPU-only and CPU + FPGA implementations on the same board. The experiments show that the CPU + FPGA implementation reduced inference time by 61%, with a 1% decrease in inference accuracy compared to the CPU implementation. We conclude that an FPGA implementation of an end-to-end autonomous driving network can reduce inference time while maintaining inference accuracy.
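The first optimization step in the abstract (reducing parameter precision from float32 to int8) is performed by TVM's quantization pass in the paper; the standalone sketch below only illustrates the underlying idea of symmetric int8 quantization, using illustrative helper names rather than the authors' TVM code. Each float weight is mapped to an integer in [-127, 127] via a single per-tensor scale, and dequantizing recovers an approximation whose error is bounded by half the scale.

```python
def quantize_int8(weights):
    # Symmetric per-tensor quantization: one scale maps the largest
    # absolute weight onto the int8 extreme value 127.
    scale = max(abs(w) for w in weights) / 127.0
    # Round each weight to the nearest representable int8 level,
    # clamping to the valid range [-127, 127].
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize_int8(q, scale):
    # Recover approximate float weights from the int8 codes.
    return [v * scale for v in q]


# Example: a small set of float32-like weights.
weights = [-1.0, -0.75, -0.5, -0.25, 0.0, 0.25, 0.5, 0.75, 1.0]
q, scale = quantize_int8(weights)
recovered = dequantize_int8(q, scale)
# The rounding error per weight is at most scale / 2.
max_err = max(abs(w - r) for w, r in zip(weights, recovered))
```

This kind of quantization is what lets the matrix computations run on the VTA's integer datapath; the paper reports that the accompanying accuracy loss after the full TVM pipeline was about 1%.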