Accelerated Local Training of CNNs by Optimized Direct Feedback Alignment Based on Stochasticity of 4 Mb C-doped Ge2Sb2Te5 PCM Chip in 40 nm Node

2020 IEEE International Electron Devices Meeting (IEDM). Pub Date: 2020-12-12. DOI: 10.1109/IEDM13553.2020.9371910

Yingming Lu, Xi Li, Longhao Yan, Teng Zhang, Yuchao Yang, Zhitang Song, Ru Huang

Citations: 10

Abstract

On-chip local training is highly desirable for applying deep neural networks in environment-adaptive edge platforms, but it is hindered by the high time and energy costs of training. Here, we demonstrate efficient training of VGG-16 and LeNet-5 by optimized direct feedback alignment, which replaces the layer-by-layer back propagation (BP) of errors. For the first time, the inherent stochasticity of phase change memory fabricated at the 40 nm node is exploited to build a merged random feedback matrix with reduced hardware cost. Owing to the physical generation of the merged matrix, in-memory error computing, and the proposed conductance drift (CD) compensation protocols, the training time and energy consumption of VGG-16 are reduced by 3× and 3.3×, respectively, compared with hardware-accelerated in-memory BP training, with 90% accuracy on CIFAR-10.
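For readers unfamiliar with direct feedback alignment (DFA), the following is a minimal NumPy sketch of the generic technique on a toy fully connected network. It is not the authors' implementation: the network size, the toy data, the learning rate, and the helper names `make_toy_data` and `train_dfa` are all illustrative assumptions, and a software pseudo-random matrix stands in for the PCM device stochasticity. Slicing every layer's feedback matrix out of one shared random matrix `B` is only one loose reading of "merged random feedback matrix"; the abstract does not specify the construction. The conductance drift compensation is not modeled.

```python
import numpy as np

def relu(a):
    return np.maximum(a, 0.0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def make_toy_data(rng, n=200, d=8, classes=3):
    # Hypothetical toy task: labels come from a random linear "teacher".
    X = rng.standard_normal((n, d))
    teacher = rng.standard_normal((d, classes))
    labels = np.argmax(X @ teacher, axis=1)
    return X, np.eye(classes)[labels]  # inputs and one-hot targets

def train_dfa(X, T, hidden=(32, 16), lr=0.1, epochs=300, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    c = T.shape[1]
    h1, h2 = hidden
    W1 = rng.standard_normal((d, h1)) * np.sqrt(2.0 / d)
    W2 = rng.standard_normal((h1, h2)) * np.sqrt(2.0 / h1)
    W3 = rng.standard_normal((h2, c)) * np.sqrt(1.0 / h2)
    # Fixed random feedback, never trained. Assumption: one shared matrix,
    # with each hidden layer using a column slice of it.
    B = rng.standard_normal((c, max(h1, h2))) / np.sqrt(c)
    B1, B2 = B[:, :h1], B[:, :h2]
    losses = []
    for _ in range(epochs):
        # Forward pass
        a1 = X @ W1
        z1 = relu(a1)
        a2 = z1 @ W2
        z2 = relu(a2)
        P = softmax(z2 @ W3)
        losses.append(float(-np.mean(np.sum(T * np.log(P + 1e-12), axis=1))))
        # Output error (gradient of mean cross-entropy w.r.t. the logits)
        e = (P - T) / n
        # DFA: project the output error straight to each hidden layer through
        # the fixed random matrices, instead of through W3.T and W2.T as BP would.
        d2 = (e @ B2) * (a2 > 0)
        d1 = (e @ B1) * (a1 > 0)
        W3 -= lr * (z2.T @ e)
        W2 -= lr * (z1.T @ d2)
        W1 -= lr * (X.T @ d1)
    return losses

if __name__ == "__main__":
    X, T = make_toy_data(np.random.default_rng(1))
    losses = train_dfa(X, T)
    print(f"initial loss {losses[0]:.3f}, final loss {losses[-1]:.3f}")
```

Because `d1` and `d2` each depend only on the output error `e`, the hidden-layer updates do not have to wait for one another, which is the property that removes the layer-by-layer error propagation of BP.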



