A Self-Test Framework for Detecting Fault-induced Accuracy Drop in Neural Network Accelerators

Fanruo Meng, Fateme S. Hosseini, Chengmo Yang
DOI: 10.1145/3394885.3431519 (https://doi.org/10.1145/3394885.3431519)
Published in: 2021 26th Asia and South Pacific Design Automation Conference (ASP-DAC)
Publication date: 2021-01-18
Citations: 6

Abstract

Hardware accelerators built with SRAM or emerging memory devices are essential for accommodating ever-increasing Deep Neural Network (DNN) workloads on resource-constrained devices. After deployment, however, the performance of these accelerators is threatened by faults in the on-chip and off-chip memories that hold millions of DNN weights. Depending on the underlying memory technology, different types of faults may occur, degrading inference accuracy. To tackle this challenge, this paper proposes an online self-test framework that monitors the accuracy of the accelerator using a small set of test images selected from the test dataset. Upon detecting a noticeable accuracy drop, the framework uses additional test images to identify the corresponding fault type and predict fault severity by analyzing changes in the ranking of the test images. Experimental results show that the method can quickly detect the fault status of a DNN accelerator and provide accurate fault type and fault severity information, enabling subsequent recovery and self-healing.
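
The two stages described in the abstract (accuracy monitoring on a small image set, then ranking analysis on additional images) can be illustrated with a minimal Python sketch. The abstract does not state how the test images are ranked, what drop counts as "noticeable", or how a ranking change maps to a fault type, so everything specific here is an assumption made for illustration: the names `accuracy_dropped`, `ranking`, and `ranking_shift`, the 5% tolerance, ranking images by the model's reported confidence, and the use of a normalised Spearman footrule distance. It is not the authors' implementation.

```python
from typing import Callable, List, Sequence, Tuple

Image = Sequence[float]
# Hypothetical model interface: returns (predicted label, confidence score).
Model = Callable[[Image], Tuple[int, float]]


def accuracy(model: Model, images: Sequence[Image], labels: Sequence[int]) -> float:
    """Fraction of test images the model classifies correctly."""
    correct = sum(1 for img, y in zip(images, labels) if model(img)[0] == y)
    return correct / len(images)


def accuracy_dropped(model: Model, images: Sequence[Image], labels: Sequence[int],
                     baseline_acc: float, tolerance: float = 0.05) -> bool:
    """Stage 1: flag a drop when accuracy falls more than `tolerance`
    below the fault-free baseline (the tolerance value is an assumption)."""
    return baseline_acc - accuracy(model, images, labels) > tolerance


def ranking(model: Model, images: Sequence[Image]) -> List[int]:
    """Order image indices from highest to lowest confidence
    (ranking by confidence is an assumption, not taken from the paper)."""
    scores = [model(img)[1] for img in images]
    return sorted(range(len(images)), key=lambda i: -scores[i])


def ranking_shift(golden: Sequence[int], observed: Sequence[int]) -> float:
    """Stage 2 signal: normalised Spearman footrule distance between the
    fault-free ranking and the observed one (0 = identical, 1 = reversed)."""
    n = len(golden)
    position = {img: rank for rank, img in enumerate(observed)}
    total = sum(abs(rank - position[img]) for rank, img in enumerate(golden))
    max_total = (n * n) // 2  # largest possible footrule distance for n items
    return total / max_total if max_total else 0.0
```

In such a scheme, the fault-free ranking would be recorded once before deployment, and a larger `ranking_shift` on the additional images would be read as a more severe fault. How the shift pattern distinguishes fault types is left open here because the abstract does not describe it.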