Residue-Net: Multiplication-free Neural Network by In-situ No-loss Migration to Residue Number Systems
Sahand Salamat, Sumiran Shubhi, Behnam Khaleghi, T. Simunic
2021 26th Asia and South Pacific Design Automation Conference (ASP-DAC), January 18, 2021
DOI: 10.1145/3394885.3431541
Citations: 5
Abstract
Deep neural networks are widely deployed on embedded devices to solve a wide range of problems, from edge sensing to autonomous driving. The accuracy of these networks is usually proportional to their complexity. Quantizing model parameters (i.e., weights) and/or activations to reduce the complexity of these networks while preserving accuracy is a popular and powerful technique. Nonetheless, previous studies have shown that the achievable quantization level is limited, as the accuracy of the network degrades beyond a certain point. We propose Residue-Net, a multiplication-free accelerator for neural networks that uses the Residue Number System (RNS) to achieve substantial energy reduction. RNS breaks down operations into several smaller operations that are simpler to implement. Moreover, Residue-Net replaces the numerous costly multiplications with simple, energy-efficient shift and add operations to further reduce the computational complexity of neural networks. To evaluate the efficiency of our proposed accelerator, we compare the performance of Residue-Net with a baseline FPGA implementation of four widely used networks, viz., LeNet, AlexNet, VGG16, and ResNet-50. When delivering the same performance as the baseline, Residue-Net reduces area and power (hence energy) by 36% and 23%, respectively, on average, with no accuracy loss. Leveraging the saved area to accelerate the quantized RNS network through parallelism, Residue-Net improves throughput by 2.8× and energy efficiency by 2.7×.
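To make the RNS idea in the abstract concrete, the sketch below shows how one wide integer operation splits into several narrow, independent operations modulo pairwise-coprime moduli, and how multiplication by a constant can be carried out with only shifts and adds. This is a minimal illustration, not the paper's implementation: the moduli (15, 16, 17), the function names, and the toy dot product are assumptions chosen for clarity; Residue-Net's actual modulus set, datapath, and quantization scheme are defined in the paper itself.

from math import prod

MODULI = (15, 16, 17)  # pairwise coprime; dynamic range = 15 * 16 * 17 = 4080

def to_rns(x):
    # Represent integer x by its residues with respect to each modulus.
    return tuple(x % m for m in MODULI)

def rns_add(a, b):
    # Add channel-wise: each residue channel is a small, independent adder.
    return tuple((ai + bi) % m for ai, bi, m in zip(a, b, MODULI))

def rns_mul(a, b):
    # Multiply channel-wise: each channel only needs a narrow operator.
    return tuple((ai * bi) % m for ai, bi, m in zip(a, b, MODULI))

def from_rns(r):
    # Recover the integer with the Chinese Remainder Theorem.
    M = prod(MODULI)
    x = 0
    for ri, m in zip(r, MODULI):
        Mi = M // m
        x += ri * Mi * pow(Mi, -1, m)  # pow(Mi, -1, m) is the modular inverse
    return x % M

def shift_add_mul(x, w):
    # Multiply x by a constant w using only shifts and adds, the kind of
    # multiplication-free operation the abstract refers to.
    acc, k = 0, 0
    while w:
        if w & 1:
            acc += x << k
        w >>= 1
        k += 1
    return acc

# Toy dot product (the core of a convolution or fully-connected layer),
# computed entirely in the residue domain and then reconstructed.
weights, activations = [3, 5, 7], [2, 4, 6]
acc = to_rns(0)
for w, a in zip(weights, activations):
    acc = rns_add(acc, rns_mul(to_rns(w), to_rns(a)))
assert from_rns(acc) == 68 == sum(w * a for w, a in zip(weights, activations))
assert shift_add_mul(13, 10) == 130  # 13*10 via (13 << 1) + (13 << 3)

Because each residue channel only ever sees numbers smaller than its modulus, the per-channel hardware stays narrow and cheap; this is the property the abstract leverages to trade wide multipliers for small, parallel shift-and-add datapaths.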