Extended Bit-Plane Compression for Convolutional Neural Network Accelerators

2019 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS), Pub Date: 2018-10-01, DOI: 10.1109/AICAS.2019.8771562

L. Cavigelli, L. Benini

Citations: 18

Abstract

After the tremendous success of convolutional neural networks in image classification, object detection, speech recognition, and other tasks, there is now rising demand for deploying these compute-intensive ML models on tightly power-constrained embedded and mobile systems at low cost, as well as for increasing throughput in data centers. This has triggered a wave of research into specialized hardware accelerators. Their performance is often constrained by I/O bandwidth, and their energy consumption is dominated by I/O transfers to off-chip memory. We introduce and evaluate a novel, hardware-friendly compression scheme for the feature maps present within convolutional neural networks. We show that an average compression ratio of 4.4× relative to uncompressed data and a gain of 60% over existing methods can be achieved for ResNet-34 with a compression block requiring <300 bits of sequential cells and minimal combinational logic.
