Multi-Precision Deep Neural Network Acceleration on FPGAs

2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC). Pub Date: 2022-01-17. DOI: 10.1109/asp-dac52403.2022.9712485

Negar Neda, Salim Ullah, A. Ghanbari, H. Mahdiani, M. Modarressi, Akash Kumar


Citations: 3

Abstract

Quantization is a promising approach to reducing the computational load of neural networks. The minimum bit-width that preserves the original accuracy varies significantly across different neural networks and even across different layers of a single neural network. Most existing designs over-provision neural network accelerators with sufficient bit-width to preserve the required accuracy across a wide range of neural networks. In this paper, we present mpDNN, a multi-precision multiplier with dynamically adjustable bit-width for deep neural network acceleration. The design supports splitting an arithmetic operator at run time into multiple independent operators with smaller bit-width, effectively increasing throughput when lower precision is required. The proposed architecture is designed for FPGAs, in that the multipliers and bit-width adjustment mechanism are optimized for the LUT-based structure of FPGAs. Experimental results show that by enabling run-time precision adjustment, mpDNN can offer a 3-15x improvement in throughput.
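The idea of splitting one wide operator into several narrower, independent ones can be illustrated in software. The sketch below is not the mpDNN design, which the abstract does not specify in detail. It is a generic textbook decomposition, assuming unsigned operands and a 16-bit multiplier built from four 8x8 sub-multipliers; the function names `partial_products`, `multiply_full`, and `multiply_split` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def partial_products(a: int, b: int, half: int = 8) -> Tuple[int, int, int, int]:
    """Split two unsigned (2*half)-bit operands and return the four half x half products."""
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half
    return a_lo * b_lo, a_lo * b_hi, a_hi * b_lo, a_hi * b_hi


def multiply_full(a: int, b: int, half: int = 8) -> int:
    """Full-precision mode: combine the four sub-products into one wide product."""
    ll, lh, hl, hh = partial_products(a, b, half)
    return ll + ((lh + hl) << half) + (hh << (2 * half))


def multiply_split(pairs: List[Tuple[int, int]], half: int = 8) -> List[int]:
    """Low-precision mode (assumed): each sub-multiplier serves its own operand pair."""
    if len(pairs) > 4:
        raise ValueError("this sketch models only four sub-multipliers")
    mask = (1 << half) - 1
    return [(x & mask) * (y & mask) for x, y in pairs]


if __name__ == "__main__":
    # One 16-bit product per step in full-precision mode.
    print(multiply_full(51234, 60001) == 51234 * 60001)
    # Up to four independent 8-bit products per step in low-precision mode.
    print(multiply_split([(3, 4), (255, 255), (0, 9), (17, 2)]))
```

In this simplified model, the same four sub-multipliers produce either one 16-bit result or four 8-bit results per step, which is the sense in which lower precision raises throughput. The 3-15x figure reported in the abstract comes from the authors' FPGA experiments and is not reproduced by this sketch.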

