{"title":"A Hybrid-Pipelined Architecture for FPGA-based Binary Weight DenseNet with High Performance-Efficiency","authors":"Shihao Zeng, Yihua Huang","doi":"10.1109/HPEC43674.2020.9286185","DOIUrl":null,"url":null,"abstract":"The DenseNet achieves remarkable performance in various computer vision tasks with much fewer parameters and operations. However, there are few acceleration designs about the DenseNet, due to its dense-connectivity structure. In this paper, we apply the binary weight method on the DenseNet and then propose a hybrid-pipelined architecture for FPGA-based acceleration of the binary weight DenseNet, which can be stored entirely in a chip. To deal with the dense-connectivity, a reusable convolution unit is developed to support conv1×1 and conv3×3 efficiently. Moreover, a theoretical method of system parallelism is proposed to guide the top-level pipelined design for the maximum efficiency. To evaluate the proposed architecture, the binary weight DenseNet-100 model is trained on CIFAR10 dataset and then implemented on VX690T FPGA, at the cost of 4.18% accuracy loss. 
The experiment demonstrates that our architecture can achieve the throughput of 514 GOPS and 889 FPS at 200MHz, and the performance-efficiency is up to 62.4%, which outperforms the most related works.","PeriodicalId":168544,"journal":{"name":"2020 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE High Performance Extreme Computing Conference (HPEC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPEC43674.2020.9286185","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
DenseNet achieves remarkable performance in various computer vision tasks with far fewer parameters and operations. However, few acceleration designs exist for DenseNet because of its dense-connectivity structure. In this paper, we apply the binary weight method to DenseNet and propose a hybrid-pipelined architecture for FPGA-based acceleration of the binary weight DenseNet, which can be stored entirely on chip. To handle the dense connectivity, a reusable convolution unit is developed that supports both conv1×1 and conv3×3 efficiently. Moreover, a theoretical method for determining system parallelism is proposed to guide the top-level pipelined design toward maximum efficiency. To evaluate the proposed architecture, a binary weight DenseNet-100 model is trained on the CIFAR10 dataset, with an accuracy loss of 4.18%, and then implemented on a VX690T FPGA. The experiment demonstrates that our architecture achieves a throughput of 514 GOPS and 889 FPS at 200 MHz, with a performance-efficiency of up to 62.4%, which outperforms most related works.
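The reported figures can be related to one another with some simple arithmetic. The following is a minimal sketch, not the authors' method: the function names are hypothetical, and the abstract does not define performance-efficiency, so the sketch assumes it is the ratio of achieved throughput to peak throughput.

```python
# Figures taken from the abstract.
THROUGHPUT_GOPS = 514.0    # achieved throughput, in giga-operations per second
FRAMES_PER_SECOND = 889.0  # achieved frame rate
PERF_EFFICIENCY = 0.624    # reported performance-efficiency (62.4%)
CLOCK_MHZ = 200.0          # clock frequency


def ops_per_frame_gop(throughput_gops: float, fps: float) -> float:
    """Giga-operations spent on one frame: throughput divided by frame rate."""
    return throughput_gops / fps


def ops_per_cycle(throughput_gops: float, clock_mhz: float) -> float:
    """Average number of operations completed in each clock cycle."""
    return (throughput_gops * 1e9) / (clock_mhz * 1e6)


def implied_peak_gops(throughput_gops: float, efficiency: float) -> float:
    """Peak throughput implied by the efficiency figure.

    ASSUMPTION: performance-efficiency = achieved throughput / peak throughput.
    The abstract does not state this definition.
    """
    return throughput_gops / efficiency


if __name__ == "__main__":
    print(f"GOP per frame:     {ops_per_frame_gop(THROUGHPUT_GOPS, FRAMES_PER_SECOND):.3f}")
    print(f"Ops per cycle:     {ops_per_cycle(THROUGHPUT_GOPS, CLOCK_MHZ):.0f}")
    print(f"Implied peak GOPS: {implied_peak_gops(THROUGHPUT_GOPS, PERF_EFFICIENCY):.1f}")
```

With the abstract's numbers, this gives roughly 0.58 GOP per frame and 2570 operations per clock cycle. The implied peak of about 824 GOPS holds only under the assumed definition of performance-efficiency.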