S. Waseem, Alavala Venkata Suraj, S. Roy
2021 IEEE Region 10 Symposium (TENSYMP), published 2021-08-23
DOI: 10.1109/TENSYMP52854.2021.9551000
Accelerating the Activation Function Selection for Hybrid Deep Neural Networks – FPGA Implementation
Much of the literature on hardware implementations of deep neural networks focuses on multiplying input signals by weights and accumulating the results. The work in this paper instead focuses on the hardware implementation of the non-linear activation functions, together with a hardware realization of the automatic, per-layer selection of the activation function in order to increase network accuracy. We use a Field Programmable Gate Array (FPGA) based hardware development platform to gain the advantages of power efficiency and edge deployment. Our novel hardware design modules accelerate the entire process of activation function selection and the generation of each activation function's output along with its derivative. The power and resource-utilization results tabulated in this paper, generated through the Xilinx® Vivado platform by targeting our design modules to the Avnet® Ultra96 v2 evaluation board, demonstrate a smaller hardware footprint and better energy efficiency than GPU (Graphics Processing Unit) and CPU (Central Processing Unit) based executions, as well as FPGA-based implementations of a few activation functions reported in the literature.
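To make the abstract's distinction concrete, the sketch below contrasts the two computational stages it describes: the weighted sum that most hardware work targets, and the selectable non-linear activation (with its derivative) that this paper implements in hardware. This is a minimal software illustration of the concept only, not the authors' FPGA design; the function and parameter names are assumptions for illustration.

```python
import math

# Common activation functions paired with their derivatives -- the
# non-linear operations whose hardware evaluation and per-layer
# selection the paper targets. Purely an illustrative sketch.
ACTIVATIONS = {
    "sigmoid": (lambda x: 1.0 / (1.0 + math.exp(-x)),
                lambda x: (1.0 / (1.0 + math.exp(-x)))
                          * (1.0 - 1.0 / (1.0 + math.exp(-x)))),
    "tanh":    (lambda x: math.tanh(x),
                lambda x: 1.0 - math.tanh(x) ** 2),
    "relu":    (lambda x: max(0.0, x),
                lambda x: 1.0 if x > 0.0 else 0.0),
}

def neuron(inputs, weights, bias, act_name):
    """Weighted sum (the stage most prior hardware work covers),
    followed by a selectable activation and its derivative (the
    stage this paper accelerates)."""
    f, df = ACTIVATIONS[act_name]
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return f(z), df(z)
```

Selecting a different `act_name` per layer mirrors, in software, the automatic per-layer activation selection that the paper's hardware modules perform.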