Improving DNN Fault Tolerance using Weight Pruning and Differential Crossbar Mapping for ReRAM-based Edge AI

2021 22nd International Symposium on Quality Electronic Design (ISQED) Pub Date : 2021-04-07 DOI:10.1109/ISQED51717.2021.9424332

Geng Yuan, Zhiheng Liao, Xiaolong Ma, Yuxuan Cai, Zhenglun Kong, Xuan Shen, Jingyan Fu, Zhengang Li, Chengming Zhang, Hongwu Peng, Ning Liu, Ao Ren, Jinhui Wang, Yanzhi Wang

{"title":"Improving DNN Fault Tolerance using Weight Pruning and Differential Crossbar Mapping for ReRAM-based Edge AI","authors":"Geng Yuan, Zhiheng Liao, Xiaolong Ma, Yuxuan Cai, Zhenglun Kong, Xuan Shen, Jingyan Fu, Zhengang Li, Chengming Zhang, Hongwu Peng, Ning Liu, Ao Ren, Jinhui Wang, Yanzhi Wang","doi":"10.1109/ISQED51717.2021.9424332","DOIUrl":null,"url":null,"abstract":"Recent research demonstrated the promise of using resistive random access memory (ReRAM) as an emerging technology to perform inherently parallel analog domain in-situ matrix-vector multiplication—the intensive and key computation in deep neural networks (DNNs). However, hardware failure, such as stuck-at-fault defects, is one of the main concerns that impedes the ReRAM devices to be a feasible solution for real implementations. The existing solutions to address this issue usually require an optimization to be conducted for each individual device, which is impractical for mass-produced products (e.g., IoT devices). In this paper, we rethink the value of weight pruning in ReRAM-based DNN design from the perspective of model fault tolerance. And a differential mapping scheme is proposed to improve the fault tolerance under a high stuck-on fault rate. Our method can tolerate almost an order of magnitude higher failure rate than the traditional two-column method in representative DNN tasks. More importantly, our method does not require extra hardware cost compared to the traditional two-column mapping scheme. The improvement is universal and does not require the optimization process for each individual device.","PeriodicalId":123018,"journal":{"name":"2021 22nd International Symposium on Quality Electronic Design (ISQED)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"22","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 22nd International Symposium on Quality Electronic Design (ISQED)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISQED51717.2021.9424332","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 22

Abstract

Recent research demonstrated the promise of using resistive random access memory (ReRAM) as an emerging technology to perform inherently parallel analog domain in-situ matrix-vector multiplication—the intensive and key computation in deep neural networks (DNNs). However, hardware failure, such as stuck-at-fault defects, is one of the main concerns that impedes the ReRAM devices to be a feasible solution for real implementations. The existing solutions to address this issue usually require an optimization to be conducted for each individual device, which is impractical for mass-produced products (e.g., IoT devices). In this paper, we rethink the value of weight pruning in ReRAM-based DNN design from the perspective of model fault tolerance. And a differential mapping scheme is proposed to improve the fault tolerance under a high stuck-on fault rate. Our method can tolerate almost an order of magnitude higher failure rate than the traditional two-column method in representative DNN tasks. More importantly, our method does not require extra hardware cost compared to the traditional two-column mapping scheme. The improvement is universal and does not require the optimization process for each individual device.

查看原文本刊更多论文

基于rram边缘人工智能的权值剪枝和差分横杆映射提高DNN容错性

最近的研究表明，使用电阻随机存取存储器(ReRAM)作为一种新兴技术，有望执行内在并行模拟域原位矩阵向量乘法，这是深度神经网络(dnn)中密集和关键的计算。然而，硬件故障，如卡在故障缺陷，是阻碍ReRAM设备成为实际实现的可行解决方案的主要问题之一。解决这个问题的现有解决方案通常需要对每个单独的设备进行优化，这对于批量生产的产品(例如物联网设备)是不切实际的。本文从模型容错的角度重新思考了权值剪枝在基于reram的深度神经网络设计中的价值。提出了一种差分映射方案，提高了系统在高卡断率下的容错性。在具有代表性的深度神经网络任务中，我们的方法可以容忍比传统的双列方法高一个数量级的故障率。更重要的是，与传统的两列映射方案相比，我们的方法不需要额外的硬件成本。这种改进是普遍的，不需要对每个单独的设备进行优化过程。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

2021 22nd International Symposium on Quality Electronic Design (ISQED)

自引率

0.00%

发文量