iFPNA: A Flexible and Efficient Deep Neural Network Accelerator with a Programmable Data Flow Engine in 28nm CMOS
Chixiao Chen, Xindi Liu, Huwan Peng, Hongwei Ding, C. R. Shi
ESSCIRC 2018 - IEEE 44th European Solid State Circuits Conference (ESSCIRC), September 2018
DOI: 10.1109/ESSCIRC.2018.8494327
Citations: 8
Abstract
This paper presents iFPNA, an instruction-and-fabric programmable neuron array: a general-purpose deep learning accelerator that achieves both energy efficiency and flexibility. The iFPNA comprises a programmable data flow engine with a custom instruction set and 16 configurable neuron slices for parallel neuron operations at different bit-widths. Convolutional neural networks with different kernel sizes are implemented by selecting among data flows such as input stationary, row stationary, and tunnel stationary. Recurrent neural networks with element-wise operations are supported by a universal activation engine. Measurement results show that the iFPNA achieves a peak energy efficiency of 1.72 TOPS/W at a 30 MHz clock rate and a 0.63 V supply voltage. At a 125 MHz clock rate, the measured latency is 60.8 ms on AlexNet and 40 ms on LSTM-512.