A Multiplier-Free RNS-Based CNN Accelerator Exploiting Bit-Level Sparsity

IF 5.1 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS
Vasilis Sakellariou;Vassilis Paliouras;Ioannis Kouretas;Hani Saleh;Thanos Stouraitis
DOI: 10.1109/TETC.2023.3301590
Journal: IEEE Transactions on Emerging Topics in Computing (JCR Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS)
Publication date: 2023-08-10
URL: https://ieeexplore.ieee.org/document/10214485/
Citations: 0

Abstract

In this work, a Residue Number System (RNS)-based Convolutional Neural Network (CNN) accelerator utilizing a multiplier-free, distributed-arithmetic Processing Element (PE) is proposed. A method for maximizing the utilization of the arithmetic hardware resources is presented; it increases the system's throughput by exploiting bit-level sparsity within the weight vectors. The proposed PE design takes advantage of the properties of RNS and of Canonical Signed Digit (CSD) encoding to achieve higher energy efficiency and a higher effective processing rate, without requiring any compression mechanism or introducing any approximation. An extensive design-space exploration over various parameters (RNS base, PE micro-architecture, encoding) is conducted using analytical models as well as experimental results from CNN benchmarks, and the various trade-offs are analyzed. A complete end-to-end RNS accelerator is developed based on the proposed PE, and is compared to traditional binary and RNS counterparts as well as to other state-of-the-art systems. Implementation results in a 22-nm process show that the proposed PE can lead to $1.85\times$ and $1.54\times$ more energy-efficient processing compared to binary and conventional RNS designs, respectively, with a $1.88\times$ maximum increase of effective throughput for the employed benchmarks. Compared to a state-of-the-art, all-digital, RNS-based system, the proposed accelerator is $8.87\times$ and $1.11\times$ more energy- and area-efficient, respectively.
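The two number representations the abstract builds on can be illustrated with a minimal sketch. This is not code from the paper: the helpers below are hypothetical and only demonstrate (a) how an RNS base of co-prime moduli turns one integer into small independent residues, and (b) how CSD encoding with digits in {-1, 0, 1} reduces the number of non-zero digits relative to plain binary, which is the bit-level sparsity a multiplier-free, shift-and-add datapath can exploit.

```python
def to_rns(x, moduli):
    """Represent an integer as its residues w.r.t. a set of co-prime moduli."""
    return tuple(x % m for m in moduli)

def to_csd(x):
    """Canonical Signed Digit encoding of a positive integer, LSB first.
    Digits are in {-1, 0, 1} and no two adjacent digits are non-zero,
    which minimizes the non-zero-digit count."""
    digits = []
    while x != 0:
        if x % 2 == 0:
            digits.append(0)
            x //= 2
        else:
            d = 2 - (x % 4)  # +1 if x ≡ 1 (mod 4), -1 if x ≡ 3 (mod 4)
            digits.append(d)
            x = (x - d) // 2
    return digits

# A classic {2^n - 1, 2^n, 2^n + 1} modulus set (n = 3), dynamic range 7*8*9 = 504.
moduli = (7, 8, 9)
print(to_rns(52, moduli))   # → (3, 4, 7)

# 7 in binary is 111 (three non-zero bits); in CSD it is 8 - 1,
# i.e. digits [-1, 0, 0, 1] with only two non-zero digits.
csd = to_csd(7)
print(csd, sum(d != 0 for d in csd))  # → [-1, 0, 0, 1] 2
```

Fewer non-zero digits means fewer additions per weight in a distributed-arithmetic PE; the paper's contribution is in exploiting this sparsity jointly with RNS properties, which this sketch does not attempt to reproduce.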
Source Journal

IEEE Transactions on Emerging Topics in Computing (Computer Science, miscellaneous)
CiteScore: 12.10
Self-citation rate: 5.10%
Articles per year: 113
Scope: IEEE Transactions on Emerging Topics in Computing publishes papers on emerging aspects of computer science, computing technology, and computing applications not currently covered by other IEEE Computer Society Transactions. Some examples of emerging topics in computing include: IT for Green, synthetic and organic computing structures and systems, advanced analytics, social/occupational computing, location-based/client computer systems, morphic computer design, electronic game systems, and health-care IT.