Dynamic Precision Multiplier For Deep Neural Network Accelerators

Chen Ding, Y. Huan, Lirong Zheng, Z. Zou
{"title":"用于深度神经网络加速器的动态精度乘法器","authors":"Chen Ding, Y. Huan, Lirong Zheng, Z. Zou","doi":"10.1109/socc49529.2020.9524752","DOIUrl":null,"url":null,"abstract":"The application of dynamic precision multipliers in the deep neural network accelerators can greatly improve system's data processing capacity under same memory bandwidth limitation. This paper presents a Dynamic Precision Multiplier (DPM) for deep learning accelerators to adapt to light-weight deep learning models with varied precision. The proposed DPM adopts Booth algorithm and Wallace Adder Tree to support parallel computation of signed/unsigned one 16-bit, two 8-bit or four 4-bit at run time. The DPM is further optimized with simplified partial product selection logic and mixed partial product selection structure techniques, reducing power cost for energy-efficient edge computing. The DPM is evaluated in both FPGA and ASIC flow, and the results show that 4-bit mode consumes the least energy among the three modes at 1.34pJ/word. It also saves nearly 22.38% and 232.17% of the power consumption under 16-bit and 8-bit mode respectively when comparing with previous similar designs.","PeriodicalId":114740,"journal":{"name":"2020 IEEE 33rd International System-on-Chip Conference (SOCC)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Dynamic Precision Multiplier For Deep Neural Network Accelerators\",\"authors\":\"Chen Ding, Y. Huan, Lirong Zheng, Z. Zou\",\"doi\":\"10.1109/socc49529.2020.9524752\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The application of dynamic precision multipliers in the deep neural network accelerators can greatly improve system's data processing capacity under same memory bandwidth limitation. This paper presents a Dynamic Precision Multiplier (DPM) for deep learning accelerators to adapt to light-weight deep learning models with varied precision. The proposed DPM adopts Booth algorithm and Wallace Adder Tree to support parallel computation of signed/unsigned one 16-bit, two 8-bit or four 4-bit at run time. The DPM is further optimized with simplified partial product selection logic and mixed partial product selection structure techniques, reducing power cost for energy-efficient edge computing. The DPM is evaluated in both FPGA and ASIC flow, and the results show that 4-bit mode consumes the least energy among the three modes at 1.34pJ/word. 
It also saves nearly 22.38% and 232.17% of the power consumption under 16-bit and 8-bit mode respectively when comparing with previous similar designs.\",\"PeriodicalId\":114740,\"journal\":{\"name\":\"2020 IEEE 33rd International System-on-Chip Conference (SOCC)\",\"volume\":\"2 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-09-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 IEEE 33rd International System-on-Chip Conference (SOCC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/socc49529.2020.9524752\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE 33rd International System-on-Chip Conference (SOCC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/socc49529.2020.9524752","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1

Abstract

The application of dynamic precision multipliers in deep neural network accelerators can greatly improve a system's data processing capacity under the same memory bandwidth limitation. This paper presents a Dynamic Precision Multiplier (DPM) for deep learning accelerators that adapts to lightweight deep learning models of varied precision. The proposed DPM adopts the Booth algorithm and a Wallace adder tree to support run-time parallel computation of one 16-bit, two 8-bit, or four 4-bit signed/unsigned multiplications. The DPM is further optimized with simplified partial product selection logic and a mixed partial product selection structure, reducing power cost for energy-efficient edge computing. The DPM is evaluated in both FPGA and ASIC flows, and the results show that the 4-bit mode consumes the least energy of the three modes, at 1.34 pJ/word. Compared with previous similar designs, it also saves nearly 22.38% and 232.17% of the power consumption in 16-bit and 8-bit modes, respectively.
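The arithmetic the abstract describes (radix-4 Booth recoding of the multiplier, a Wallace adder tree that sums the partial products, and a mode setting that splits one 16-bit datapath into two 8-bit or four 4-bit lanes) can be illustrated with a short behavioral model. The Python sketch below is a minimal model, not the authors' RTL: it assumes signed operands only, treats each lane as an independent Booth multiplication, and the names booth_radix4_digits, dpm_multiply, and the mode parameter are illustrative, not taken from the paper.

def booth_radix4_digits(y: int, bits: int):
    """Recode a signed `bits`-wide multiplier into radix-4 Booth digits
    in {-2, -1, 0, +1, +2}, halving the partial-product count versus a
    schoolbook multiplier."""
    y &= (1 << bits) - 1          # two's-complement bit pattern
    y <<= 1                       # append the implicit y[-1] = 0 bit
    digits = []
    for i in range(0, bits, 2):
        w = (y >> i) & 0b111      # overlapping 3-bit Booth window
        digits.append((w & 1) + ((w >> 1) & 1) - 2 * ((w >> 2) & 1))
    return digits

def to_signed(v: int, bits: int) -> int:
    """Reinterpret the low `bits` bits of v as two's complement."""
    v &= (1 << bits) - 1
    return v - (1 << bits) if v & (1 << (bits - 1)) else v

def booth_multiply(x: int, y: int, bits: int) -> int:
    """One signed lane: sum the Booth partial products d_i * x * 4**i
    (in hardware this summation is what the Wallace adder tree does)."""
    xs = to_signed(x, bits)
    return sum((d * xs) << (2 * i)
               for i, d in enumerate(booth_radix4_digits(y, bits)))

def dpm_multiply(x: int, y: int, mode: int):
    """Behavioral DPM: mode 16 gives one 16x16 product, mode 8 two 8x8,
    mode 4 four 4x4; lanes are packed little-endian in x and y."""
    mask = (1 << mode) - 1
    return [booth_multiply((x >> (mode * i)) & mask,
                           (y >> (mode * i)) & mask, mode)
            for i in range(16 // mode)]

# Four independent signed 4-bit lanes packed into one 16-bit word each:
x = (0x3 << 12) | (0xF << 8) | (0x5 << 4) | 0x2   # lanes, high to low: 3, -1, 5, 2
y = (0x2 << 12) | (0x7 << 8) | (0xE << 4) | 0x3   # lanes, high to low: 2, 7, -2, 3
print(dpm_multiply(x, y, 4))                      # [6, -10, -7, 6]
assert dpm_multiply(1234, (-5678) & 0xFFFF, 16) == [1234 * -5678]

In the actual hardware, the three modes would typically share a single partial-product array and Wallace tree, with the mode signal gating sign extension and carry propagation at the lane boundaries; the sketch reproduces only the lane arithmetic, not that shared structure, and it omits the unsigned variants the paper also supports.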