Efficient Hardware Acceleration of CNNs using Logarithmic Data Representation with Arbitrary log-base

2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD) Pub Date : 2018-11-05 DOI:10.1145/3240765.3240803

Sebastian Vogel, Mengyu Liang, A. Guntoro, W. Stechele, G. Ascheid


Citations: 45

Abstract

Efficient acceleration of Deep Neural Networks is a manifold task. To reduce memory requirements and energy consumption, we propose the use of dedicated accelerators with novel arithmetic processing elements that use bit shifts instead of multipliers. While a regular power-of-2 quantization scheme allows for multiplierless computation of multiply-accumulate operations, it suffers from high accuracy losses in neural networks. Therefore, we evaluate the use of powers of arbitrary log bases and confirm their suitability for quantization of pre-trained neural networks. The presented method works without retraining the neural network and is therefore suitable for applications in which no labeled training data is available. To verify the proposed method, we implement the log-based processing elements in a neural network accelerator on an FPGA. The hardware efficiency is evaluated in terms of FPGA utilization and energy requirements in comparison to regular 8-bit fixed-point multiplier-based acceleration. With this approach, hardware resources are minimized and power consumption is reduced by 22.3%.
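The multiplierless idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a log base of the form 2^(1/2^n), which the abstract does not specify, and every function name, parameter (frac_bits, lut_bits), and the exponent range is hypothetical. Under that assumption, a product of two log-quantized values reduces to adding their integer exponents, and converting back to the linear domain needs only a small lookup table and a bit shift.

```python
import math

def quantize_log(x, frac_bits=1, min_exp=-16, max_exp=15):
    """Round x to sign * base**k, with base = 2**(1 / 2**frac_bits) (assumed form).

    Returns (sign, k); sign is 0 when x == 0.
    """
    if x == 0:
        return 0, 0
    steps = 1 << frac_bits                      # exponent steps per factor of 2
    k = round(math.log2(abs(x)) * steps)        # nearest exponent in the log domain
    k = max(min_exp, min(max_exp, k))           # clamp to the (hypothetical) exponent range
    return (1 if x > 0 else -1), k

def dequantize_log(sign, k, frac_bits=1):
    """Reference conversion back to a float, for checking only."""
    return sign * 2.0 ** (k / (1 << frac_bits))

def log_product_to_linear(k_sum, frac_bits=1, lut_bits=8):
    """Convert base**k_sum to fixed point using a LUT and a shift, no multiplier.

    The result carries lut_bits fractional bits.
    """
    steps = 1 << frac_bits
    q, r = divmod(k_sum, steps)                 # k_sum = q * steps + r, with 0 <= r < steps
    lut = [round(2 ** (i / steps) * (1 << lut_bits)) for i in range(steps)]
    mantissa = lut[r]                           # fractional part of the exponent via LUT
    return mantissa << q if q >= 0 else mantissa >> -q  # integer part via bit shift

def log_dot(acts, weights, frac_bits=1, lut_bits=8):
    """Multiply-accumulate where every product is an exponent add plus LUT and shift."""
    acc = 0
    for a, w in zip(acts, weights):
        sa, ka = quantize_log(a, frac_bits)
        sw, kw = quantize_log(w, frac_bits)
        if sa == 0 or sw == 0:
            continue                            # a zero operand contributes nothing
        term = log_product_to_linear(ka + kw, frac_bits, lut_bits)
        acc += term if sa == sw else -term      # sign of the product from operand signs
    return acc / (1 << lut_bits)                # scale back from fixed point

if __name__ == "__main__":
    print(quantize_log(1.4))                    # nearest power of sqrt(2)
    print(log_dot([1.0, 2.0], [2.0, 0.5]))
```

With frac_bits=0 the sketch degenerates to plain power-of-2 quantization. Larger values of frac_bits give a finer grid of representable magnitudes at the cost of a larger lookup table, which is one way to read the accuracy trade-off the abstract mentions.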
