Efficient Implementation of Activation Function on FPGA for Accelerating Neural Networks

Kai Qian, Yinqiu Liu, Zexu Zhang, Kun Wang
{"title":"Efficient Implementation of Activation Function on FPGA for Accelerating Neural Networks","authors":"Kai Qian, Yinqiu Liu, Zexu Zhang, Kun Wang","doi":"10.1109/ISCAS46773.2023.10181406","DOIUrl":null,"url":null,"abstract":"In this paper, we present the Integer Lightweight Softmax (ILS) algorithm for approximating the Softmax activation function. The accurate implementation of Softmax on FPGA can be huge resource-intensive and memory-hungry. Then, we present the implementation of ILS on a Xilinx XCKU040 FPGA to evaluate the effectiveness of ILS. Evaluations on CIFAR 10, CIFAR 100 and ImageNet show that ILS achieves up to $2.47\\times, 40\\times$ and $323\\times$ speedup over CPU implementation, and $4\\times, 63\\times$ and $51\\times$ speedup over GPU implementation, respectively. In comparison to previous FPGA-based Softmax implementations, ILS strikes a better balance between resource consumption and precision accuracy.","PeriodicalId":177320,"journal":{"name":"2023 IEEE International Symposium on Circuits and Systems (ISCAS)","volume":"113 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE International Symposium on Circuits and Systems (ISCAS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISCAS46773.2023.10181406","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

In this paper, we present the Integer Lightweight Softmax (ILS) algorithm for approximating the Softmax activation function. The accurate implementation of Softmax on FPGA can be huge resource-intensive and memory-hungry. Then, we present the implementation of ILS on a Xilinx XCKU040 FPGA to evaluate the effectiveness of ILS. Evaluations on CIFAR 10, CIFAR 100 and ImageNet show that ILS achieves up to $2.47\times, 40\times$ and $323\times$ speedup over CPU implementation, and $4\times, 63\times$ and $51\times$ speedup over GPU implementation, respectively. In comparison to previous FPGA-based Softmax implementations, ILS strikes a better balance between resource consumption and precision accuracy.
加速神经网络激活函数在FPGA上的高效实现
本文提出了一种近似Softmax激活函数的整数轻量级Softmax (ILS)算法。Softmax在FPGA上的精确实现可能是巨大的资源密集型和内存消耗。然后,我们在Xilinx XCKU040 FPGA上实现了ILS,以评估ILS的有效性。在CIFAR 10、CIFAR 100和ImageNet上的评估表明,ILS比CPU实现的加速分别达到2.47倍、40倍和323倍,比GPU实现的加速分别达到4倍、63倍和51倍。与以前基于fpga的Softmax实现相比,ILS在资源消耗和精度精度之间取得了更好的平衡。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信