Exploration and Generation of Efficient FPGA-based Deep Neural Network Accelerators

2021 IEEE Workshop on Signal Processing Systems (SiPS) Pub Date : 2021-10-01 DOI:10.1109/SiPS52927.2021.00030

Nermine Ali, Jean-Marc Philippe, Benoît Tain, P. Coussy


Citations: 2

Abstract

Convolutional Neural Networks (CNNs) have emerged as an answer to next-generation applications such as complex image recognition and object detection. Embedding such compute-intensive and memory-hungry algorithms on edge systems will lead to smarter, high-value applications. However, algorithmic innovation in the CNN field leaves hardware accelerators one step behind. Reconfigurable hardware (e.g. FPGAs) allows the design of custom accelerators adapted to new algorithms. Furthermore, new design approaches such as high-level synthesis (HLS) make it possible to generate RTL code from high-level function descriptions. This paper presents a high-level CNN accelerator generation framework for FPGAs. The first phase of the framework characterizes CNN descriptions using hardware-aware metrics. These metrics then drive a hardware generation phase, which builds the appropriate C source code implementation for each layer of the network. Finally, an HLS tool outputs the synthesizable RTL code of the accelerator. This approach aims to narrow the gap between evolving applications based on artificial intelligence and hardware accelerators, thus reducing the time-to-market of new systems.
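
The abstract does not say which hardware-aware metrics the characterization phase computes. As a rough illustration only, the sketch below assumes two common ones for a convolution layer: the number of multiply-accumulate (MAC) operations and the number of weights. The class name `ConvLayer`, its fields, and the derived `macs_per_weight` ratio are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class ConvLayer:
    """Hypothetical description of one convolution layer (not the paper's format)."""
    in_channels: int
    out_channels: int
    kernel: int       # square kernel side length
    out_height: int   # output feature-map height
    out_width: int    # output feature-map width

    def weights(self) -> int:
        # One kernel x kernel filter per (input channel, output channel) pair; biases ignored.
        return self.out_channels * self.in_channels * self.kernel * self.kernel

    def macs(self) -> int:
        # Every output pixel of every output channel needs in_channels * kernel^2 MACs.
        return self.out_height * self.out_width * self.weights()

    def macs_per_weight(self) -> float:
        # Assumed reuse indicator: how many times each weight is used per inference.
        return self.macs() / self.weights()


if __name__ == "__main__":
    layer = ConvLayer(in_channels=3, out_channels=16, kernel=3,
                      out_height=32, out_width=32)
    print(layer.weights(), layer.macs(), layer.macs_per_weight())
```

In a flow like the one described, per-layer figures of this kind could plausibly guide choices such as how much parallelism to give each layer in the generated C code, but how the framework actually uses its metrics is not stated in the abstract.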

