RRAM-Based Neuromorphic Hardware Reliability Improvement by Self-Healing and Error Correction
Jiaping Hu, Kuan-Wei Hou, Chih-Yen Lo, Yung-Fa Chou, Cheng-Wen Wu
2018 IEEE International Test Conference in Asia (ITC-Asia), published 2018-08-01
DOI: https://doi.org/10.1109/ITC-ASIA.2018.00014
Citations: 5
Abstract
Neural networks (NNs) are considered an important factor in the success of many AI applications. Because the von Neumann architecture is inefficient for NN computation, researchers have been investigating new semiconductor devices and architectures for neuromorphic computing. The crossbar RRAM, an emerging non-volatile memory composed of memristor devices, can be used to accelerate or emulate NN computation. However, memristor device defects exposed during manufacturing or field use may degrade the performance of the NN, causing reliability issues for the neuromorphic hardware. In this paper, we consider two existing fault models for the 1T1R RRAM cell: the stuck-at fault and the transistor stuck-on fault. Evaluation of their influence on the NN shows that with about 10% faulty cells in the memristor array, the accuracy of the MLP model degrades by about 10%, and that of LeNet 300-100 and LeNet 5 degrades by more than 65%. We therefore propose a self-healing approach and an error correction approach to reduce the accuracy degradation and improve the reliability (lifetime) of the neuromorphic hardware. Our simulation results show that if the accuracy degradation is limited to within 5%, the proposed error correction approach can tolerate up to 40% faulty cells for the MLP model, and up to 60% faulty cells for the LeNet 300-100 and LeNet 5 models. In addition, the error correction method can extend the lifetime of the neuromorphic hardware by 5% or more.
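To make the fault-rate evaluation described in the abstract more concrete, the following is a minimal sketch of how stuck-at faults might be injected into a weight matrix before measuring accuracy. It is not the authors' simulator. The function name `inject_stuck_at_faults`, the `sa0_fraction` parameter, and the mapping of stuck cells to the minimum and maximum weight in the array are all assumptions made for illustration. The transistor stuck-on fault is not modeled here.

```python
import numpy as np

def inject_stuck_at_faults(weights, fault_rate, rng, sa0_fraction=0.5):
    """Return a faulty copy of `weights` and the flat indices of the stuck cells.

    Assumption (not from the paper): a stuck-at-0 cell reads as the minimum
    weight in the array and a stuck-at-1 cell reads as the maximum.
    """
    faulty = weights.copy()
    n_cells = weights.size
    n_faults = int(round(fault_rate * n_cells))
    # Pick distinct cell positions to corrupt.
    idx = rng.choice(n_cells, size=n_faults, replace=False)
    # Split the chosen cells between the two stuck values.
    n_sa0 = int(round(sa0_fraction * n_faults))
    flat = faulty.reshape(-1)  # view into `faulty`, so writes modify it
    flat[idx[:n_sa0]] = weights.min()
    flat[idx[n_sa0:]] = weights.max()
    return faulty, idx

# Example: a 300x100 layer with 10% of its cells stuck.
rng = np.random.default_rng(0)
w = rng.normal(size=(300, 100))
w_faulty, idx = inject_stuck_at_faults(w, 0.10, rng)
```

In a full experiment, `w_faulty` would replace the layer weights of a trained model, and the test accuracy would be recorded as `fault_rate` is swept upward.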