Bit-Offsetter: A Bit-serial DNN Accelerator with Weight-offset MAC for Bit-wise Sparsity Exploitation

Siqi He, Hongyi Zhang, Mengjie Li, Haozhe Zhu, Chixiao Chen, Qi Liu, Xiaoyang Zeng
Published in: 2023 IEEE 5th International Conference on Artificial Intelligence Circuits and Systems (AICAS)
Publication date: 2023-06-11
DOI: 10.1109/AICAS57966.2023.10168618 (https://doi.org/10.1109/AICAS57966.2023.10168618)
Citations: 0

Abstract

The rapid evolution of deep neural networks (DNNs) has brought a massive computational burden that makes DNNs difficult to deploy on edge devices. This has given rise to specialized hardware that exploits the sparsity of DNN parameters. Bit-serial architectures (BSAs) have great performance potential because they can leverage abundant bit-wise sparsity. However, the distribution of effective bits in the weights limits the performance of BSA designs. To improve BSA efficiency, we propose a weight-offset multiply-accumulate (MAC) scheme and an associated hardware design called Bit-offsetter. Weight-offsetting not only significantly boosts bit-wise sparsity but also yields a more balanced distribution of essential bits. Beyond leveraging the abundant bit-wise sparsity induced by weight-offsetting, Bit-offsetter is equipped with a load-balancing scheduler that reduces idle cycles and mitigates utilization degradation. In our experiments on a series of DNN models, weight-offsetting increases the bit-wise sparsity of pre-trained weights up to 77.4% on average. The weight-offset MAC scheme with Bit-offsetter achieves a 3.28× speedup and a 2.94× energy-efficiency improvement over the baseline.
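The abstract does not spell out how the offset is chosen or which number format the hardware uses, so the following is only a minimal sketch of the general idea under stated assumptions: integer weights in sign-magnitude form, one shared offset per weight group, and a brute-force offset search. All function names (`essential_bits`, `bit_sparsity`, `best_offset`, `offset_mac`) are hypothetical and are not taken from the paper. The sketch relies on the identity sum(w_i * a_i) = sum((w_i - c) * a_i) + c * sum(a_i), so a bit-serial datapath would only need to process the (sparser) residuals w_i - c, plus one correction term per group.

```python
def essential_bits(w: int) -> int:
    """Count non-zero magnitude bits of w (sign-magnitude view; an assumption)."""
    return bin(abs(w)).count("1")


def bit_sparsity(weights, bits=8):
    """Fraction of zero bits among the (bits - 1) magnitude bits of each weight."""
    total = len(weights) * (bits - 1)
    return 1.0 - sum(essential_bits(w) for w in weights) / total


def best_offset(weights, candidates=range(-16, 17)):
    """Hypothetical search: the offset that minimizes total essential bits."""
    return min(candidates, key=lambda c: sum(essential_bits(w - c) for w in weights))


def mac(weights, acts):
    """Plain multiply-accumulate, used as the reference result."""
    return sum(w * a for w, a in zip(weights, acts))


def offset_mac(weights, acts, offset):
    """MAC on offset residuals plus a single correction term offset * sum(acts)."""
    residuals = [w - offset for w in weights]
    return mac(residuals, acts) + offset * sum(acts)


# Illustrative data only; not taken from the paper.
weights = [7, 7, 6, 9, 7, 8]
acts = [3, 1, 4, 1, 5, 9]

c = best_offset(weights)
residuals = [w - c for w in weights]
print("offset:", c)
print("sparsity before:", round(bit_sparsity(weights), 3))
print("sparsity after: ", round(bit_sparsity(residuals), 3))
print("results match:", offset_mac(weights, acts, c) == mac(weights, acts))
```

In this toy group the weights cluster around one value, so subtracting it leaves residuals with very few non-zero bits, while the correction term costs only one extra multiplication per group. How the actual design selects offsets and schedules the resulting essential bits is not described in the abstract.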