Online Fault-Tolerance for Memristive Neuromorphic Fabric Based on Local Approximation
Soyed Tuhin Ahmed, R. Rakhmatullin, M. Tahoori
2023 IEEE European Test Symposium (ETS), published 2023-05-22
DOI: 10.1109/ETS56758.2023.10174237 (https://doi.org/10.1109/ETS56758.2023.10174237)
Citations: 1
Abstract
Neural networks (NNs) are a widely used problem-solving tool, but their high computational cost and power consumption make them expensive. Computation-in-Memory (CiM) architectures, which use resistive non-volatile memories, are a promising solution due to their high energy efficiency. However, manufacturing defects and in-field faults can reduce the reliability and inference accuracy of CiM-implemented NNs. Existing sophisticated fault detection and tolerance techniques require long downtime for testing and repair. In certain applications, e.g., "always-on" NN applications, such downtime may not be acceptable. This paper therefore proposes a low-cost online fault-tolerance technique based on local approximations to ensure continuous NN operation with acceptable accuracy. Our approach reduces hardware overhead by up to 99.37% compared to conventional redundancy-based approaches while still achieving accuracy within 2% of the trained NNs.