RRAM-Based Neuromorphic Hardware Reliability Improvement by Self-Healing and Error Correction

2018 IEEE International Test Conference in Asia (ITC-Asia) Pub Date : 2018-08-01 DOI:10.1109/ITC-ASIA.2018.00014

Jiaping Hu, Kuan-Wei Hou, Chih-Yen Lo, Yung-Fa Chou, Cheng-Wen Wu


Citations: 5

Abstract

Neural networks (NNs) are considered an important factor in the success of many AI applications. Because the von Neumann architecture is inefficient for NN computation, researchers have been investigating new semiconductor devices and architectures for neuromorphic computing. The crossbar RRAM, an emerging non-volatile memory composed of memristor devices, can be used to accelerate or emulate NN computation. However, memristor device defects exposed during manufacturing or field use may degrade the performance of the NN, causing reliability issues for the neuromorphic hardware. In this paper, we consider two existing fault models for the 1T1R RRAM cell, i.e., the stuck-at fault and the transistor stuck-on fault. Evaluation of their influence on the NN shows that with about 10% faulty cells in the memristor array, the accuracy of the MLP model degrades by about 10%, and that of the LeNet 300-100 and LeNet 5 models degrades by more than 65%. Therefore, we propose a self-healing approach and an error correction approach to reduce the accuracy degradation and improve the reliability (lifetime) of the neuromorphic hardware. Our simulation results show that if we limit the accuracy degradation to within 5%, the proposed error correction approach can tolerate up to 40% faulty cells for the MLP model, and up to 60% faulty cells for the LeNet 300-100 and LeNet 5 models. In addition, the error correction method can extend the lifetime of the neuromorphic hardware by 5% or more.
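The fault-injection evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' simulator. It assumes that faults are spread uniformly over the array, that a stuck-at-0 cell reads as the smallest weight in the array and a stuck-at-1 cell as the largest, and that the two are equally likely. The transistor stuck-on fault is not modelled. The function name `inject_stuck_at_faults`, the `sa0_fraction` parameter, and the 300 x 100 layer shape are hypothetical choices made for illustration.

```python
import numpy as np


def inject_stuck_at_faults(weights, fault_rate, rng, sa0_fraction=0.5):
    """Return a faulty copy of `weights` and the mask of faulty cells.

    Assumptions (not taken from the paper): faults are uniformly
    distributed, a stuck-at-0 cell reads as the smallest weight in the
    array, and a stuck-at-1 cell reads as the largest.
    """
    faulty = weights.copy()
    # Each cell is faulty independently with probability `fault_rate`.
    fault_mask = rng.random(weights.shape) < fault_rate
    # Split the faulty cells between stuck-at-0 and stuck-at-1.
    sa0_mask = fault_mask & (rng.random(weights.shape) < sa0_fraction)
    sa1_mask = fault_mask & ~sa0_mask
    faulty[sa0_mask] = weights.min()
    faulty[sa1_mask] = weights.max()
    return faulty, fault_mask


rng = np.random.default_rng(0)
# Hypothetical weight matrix standing in for one fully connected layer.
weights = rng.normal(0.0, 0.1, size=(300, 100))
faulty, mask = inject_stuck_at_faults(weights, fault_rate=0.10, rng=rng)
print(f"fraction of faulty cells: {mask.mean():.1%}")
```

To reproduce the kind of experiment the abstract reports, the faulty matrix would replace the original weights during inference, and the test accuracy would be compared against the fault-free baseline at each fault rate.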


