{"title":"用于深度神经网络加速器的动态精度乘法器","authors":"Chen Ding, Y. Huan, Lirong Zheng, Z. Zou","doi":"10.1109/socc49529.2020.9524752","DOIUrl":null,"url":null,"abstract":"The application of dynamic precision multipliers in the deep neural network accelerators can greatly improve system's data processing capacity under same memory bandwidth limitation. This paper presents a Dynamic Precision Multiplier (DPM) for deep learning accelerators to adapt to light-weight deep learning models with varied precision. The proposed DPM adopts Booth algorithm and Wallace Adder Tree to support parallel computation of signed/unsigned one 16-bit, two 8-bit or four 4-bit at run time. The DPM is further optimized with simplified partial product selection logic and mixed partial product selection structure techniques, reducing power cost for energy-efficient edge computing. The DPM is evaluated in both FPGA and ASIC flow, and the results show that 4-bit mode consumes the least energy among the three modes at 1.34pJ/word. It also saves nearly 22.38% and 232.17% of the power consumption under 16-bit and 8-bit mode respectively when comparing with previous similar designs.","PeriodicalId":114740,"journal":{"name":"2020 IEEE 33rd International System-on-Chip Conference (SOCC)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Dynamic Precision Multiplier For Deep Neural Network Accelerators\",\"authors\":\"Chen Ding, Y. Huan, Lirong Zheng, Z. Zou\",\"doi\":\"10.1109/socc49529.2020.9524752\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The application of dynamic precision multipliers in the deep neural network accelerators can greatly improve system's data processing capacity under same memory bandwidth limitation. 
This paper presents a Dynamic Precision Multiplier (DPM) for deep learning accelerators to adapt to light-weight deep learning models with varied precision. The proposed DPM adopts Booth algorithm and Wallace Adder Tree to support parallel computation of signed/unsigned one 16-bit, two 8-bit or four 4-bit at run time. The DPM is further optimized with simplified partial product selection logic and mixed partial product selection structure techniques, reducing power cost for energy-efficient edge computing. The DPM is evaluated in both FPGA and ASIC flow, and the results show that 4-bit mode consumes the least energy among the three modes at 1.34pJ/word. It also saves nearly 22.38% and 232.17% of the power consumption under 16-bit and 8-bit mode respectively when comparing with previous similar designs.\",\"PeriodicalId\":114740,\"journal\":{\"name\":\"2020 IEEE 33rd International System-on-Chip Conference (SOCC)\",\"volume\":\"2 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-09-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 IEEE 33rd International System-on-Chip Conference (SOCC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/socc49529.2020.9524752\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE 33rd International System-on-Chip Conference 
(SOCC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/socc49529.2020.9524752","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Dynamic Precision Multiplier For Deep Neural Network Accelerators
Applying dynamic-precision multipliers in deep neural network accelerators can greatly improve a system's data-processing capacity under the same memory-bandwidth limitation. This paper presents a Dynamic Precision Multiplier (DPM) for deep learning accelerators that adapts to lightweight deep learning models of varied precision. The proposed DPM adopts the Booth algorithm and a Wallace adder tree to support parallel computation of one 16-bit, two 8-bit, or four 4-bit signed/unsigned multiplications at run time. The DPM is further optimized with simplified partial-product selection logic and a mixed partial-product selection structure, reducing power cost for energy-efficient edge computing. The DPM is evaluated in both FPGA and ASIC flows, and the results show that the 4-bit mode consumes the least energy of the three modes, at 1.34 pJ/word. Compared with previous similar designs, it also saves nearly 22.38% and 232.17% of the power consumption in 16-bit and 8-bit modes, respectively.
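To make the mode-splitting concrete, the sketch below models the lane behavior the abstract describes: a 16-bit datapath that is reused as one 16-bit, two 8-bit, or four 4-bit independent multiplications depending on the selected mode. This is a hypothetical behavioral model for illustration only (the function name `dpm_multiply` and the unsigned-lane simplification are assumptions); it does not reproduce the paper's Booth encoding, Wallace adder tree, or partial-product selection hardware.

```python
def dpm_multiply(a, b, mode):
    """Behavioral model of a dynamic-precision multiplier's lane modes.

    a, b : 16-bit packed operand words.
    mode : lane width in bits -- 16 (one multiply), 8 (two multiplies),
           or 4 (four multiplies), mirroring the DPM's three run-time modes.

    Returns the list of per-lane products, lowest lane first.
    Lanes are treated as unsigned here for simplicity; the actual DPM
    also supports signed operands.
    """
    assert mode in (16, 8, 4), "unsupported lane width"
    lanes = 16 // mode            # number of independent multiplications
    mask = (1 << mode) - 1        # extracts one lane's bits
    products = []
    for i in range(lanes):
        ai = (a >> (i * mode)) & mask   # i-th lane of operand a
        bi = (b >> (i * mode)) & mask   # i-th lane of operand b
        products.append(ai * bi)
    return products
```

For example, in 8-bit mode the operand word 0x0203 is split into the lanes 0x03 and 0x02, so multiplying it by 0x0504 yields the two independent products 3*4 and 2*5. Packing narrower operands this way is what lets the accelerator move more operands per memory fetch under the same bandwidth.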