Authors: Dan Shan, Guotao Cong, W. Lu
Venue: 2020 5th International Conference on Computational Intelligence and Applications (ICCIA)
Publication date: 2020-06-01
DOI: https://doi.org/10.1109/ICCIA49625.2020.00047
A CNN Accelerator on FPGA with a Flexible Structure
Most existing convolutional neural network (CNN) implementations run as PC software and cannot meet system requirements for real-time operation, low power, and miniaturization. This paper proposes a CNN accelerator with a flexible structure, based on a Field-Programmable Gate Array (FPGA), for recognizing MNIST handwritten digits. The system uses deep pipelining and optimizes inter-layer and intra-layer parallelism at both coarse and fine granularity. Because convolution layers share a similar structure, the design uses a structured circuit that makes it easy to expand the number of layers and neurons. Rational organization of the FPGA's internal memory resources improves both classification throughput and inter-layer data throughput. Compared with a general-purpose CPU, the accelerator achieves a 3x speedup at 50 MHz while consuming only 2% of the CPU's power. Finally, performance and power consumption are compared with other accelerators using VGG16.
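
The two headline figures in the abstract (3x acceleration, 2% of the CPU's power) can be combined into a rough energy-per-classification estimate. The sketch below is not taken from the paper: it assumes both power figures are averages measured over the same workload, which the abstract does not state, and the function name `relative_energy` is hypothetical. Under that assumption, energy is power multiplied by time, so the relative energy is the power ratio divided by the speedup.

```python
def relative_energy(speedup: float, power_ratio: float) -> float:
    """Energy per task relative to the baseline.

    E_fpga / E_cpu = (P_fpga / P_cpu) * (t_fpga / t_cpu)
                   = power_ratio / speedup
    Assumes average power over the same workload (an assumption,
    not something the abstract states).
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return power_ratio / speedup


SPEEDUP = 3.0       # from the abstract: 3 times acceleration over the CPU
POWER_RATIO = 0.02  # from the abstract: power is 2% of the CPU's

ratio = relative_energy(SPEEDUP, POWER_RATIO)
print(f"Relative energy per classification: {ratio:.4f}")
print(f"Energy-efficiency gain over the CPU: {1 / ratio:.0f}x")
```

Read this way, the accelerator would use well under 1% of the CPU's energy per classification, but the estimate is only as good as the assumption that the reported power figures are comparable averages.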