Design and Implementation of a Low-Power, Embedded CNN Accelerator on a Low-end FPGA

2019 22nd Euromicro Conference on Digital System Design (DSD). Published: 2019-08-01. DOI: 10.1109/DSD.2019.00102

Bahareh Khabbazan, S. Mirzakuchaki

Citations: 20

Abstract

This paper presents optimized hardware for Convolutional Neural Networks (CNNs), intended for implementation on embedded vision systems. The design is meant to be implemented with minimum resource consumption on a low-end hardware platform. We propose an architecture on a Z-turn evaluation board featuring a Xilinx Zynq-7000 system on chip (SoC). All computations in this architecture are optimized for 8-bit operation. The accelerator runs at 160 MHz and consumes 1.77 W, which leads to a performance of 40.96 GOP/s using only 134 computing units and 601 KB of internal memory. We therefore claim that the acceptable speed, low power, and low area consumption of this architecture make it an ideal choice for portable and embedded CNN applications.
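The figures quoted in the abstract can be combined into a few derived metrics. The sketch below is not from the paper: it only divides the reported numbers (160 MHz, 1.77 W, 40.96 GOP/s), and the function names are hypothetical. Converting operations to multiply-accumulates (MACs) assumes the common convention that one MAC counts as two operations, which the abstract does not state.

```python
# Figures quoted in the abstract.
CLOCK_HZ = 160e6         # accelerator clock frequency (160 MHz)
POWER_W = 1.77           # reported power consumption in watts
THROUGHPUT_GOPS = 40.96  # reported performance in GOP/s


def ops_per_cycle(throughput_gops: float, clock_hz: float) -> float:
    """Operations completed per clock cycle."""
    return throughput_gops * 1e9 / clock_hz


def gops_per_watt(throughput_gops: float, power_w: float) -> float:
    """Energy efficiency in GOP/s per watt."""
    return throughput_gops / power_w


def macs_per_cycle(ops: float) -> float:
    """ASSUMPTION (not stated in the abstract): 1 MAC = 2 operations."""
    return ops / 2


if __name__ == "__main__":
    ops = ops_per_cycle(THROUGHPUT_GOPS, CLOCK_HZ)
    print(f"operations per cycle: {ops:.1f}")
    print(f"MACs per cycle (assumed 2 ops/MAC): {macs_per_cycle(ops):.1f}")
    print(f"efficiency: {gops_per_watt(THROUGHPUT_GOPS, POWER_W):.2f} GOP/s/W")
```

Under these assumptions the reported throughput works out to roughly 256 operations per clock cycle and about 23.1 GOP/s per watt. How those operations map onto the 134 computing units is not described in the abstract, so the sketch leaves that open.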
