Characterizing and Improving Resilience of Accelerators to Memory Errors in Autonomous Robots

IF 2 Q3 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
Deval Shah, Zi Yu Xue, Karthik Pattabiraman, Tor M. Aamodt
{"title":"Characterizing and Improving Resilience of Accelerators to Memory Errors in Autonomous Robots","authors":"Deval Shah, Zi Yu Xue, Karthik Pattabiraman, Tor M. Aamodt","doi":"10.1145/3627828","DOIUrl":null,"url":null,"abstract":"Motion planning is a computationally intensive and well-studied problem in autonomous robots. However, motion planning hardware accelerators (MPA) must be soft-error resilient for deployment in safety-critical applications, and blanket application of traditional mitigation techniques is ill-suited due to cost, power, and performance overheads. We propose Collision Exposure Factor (CEF), a novel metric to assess the failure vulnerability of circuits processing spatial relationships, including motion planning. CEF is based on the insight that the safety violation probability increases with the surface area of the physical space exposed by a bit-flip. We evaluate CEF on four MPAs. We demonstrate empirically that CEF is correlated with safety violation probability, and that CEF-aware selective error mitigation provides 12.3 ×, 9.6 ×, and 4.2 × lower dangerous Failures-In-Time rate on average for the same amount of protected memory compared to uniform, bit-position, and access-frequency-aware selection of critical data. Furthermore, we show how to employ CEF to enable fault characterization using 23, 000 × fewer fault injection (FI) experiments than exhaustive FI, and evaluate our FI approach on different robots and MPAs. We demonstrate that CEF-aware FI can provide insights on vulnerable bits in an MPA while taking the same amount of time as uniform statistical FI. Finally, we use the CEF to formulate guidelines for designing soft-error resilient MPAs.","PeriodicalId":7055,"journal":{"name":"ACM Transactions on Cyber-Physical Systems","volume":null,"pages":null},"PeriodicalIF":2.0000,"publicationDate":"2023-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Cyber-Physical Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3627828","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
引用次数: 0

Abstract

Motion planning is a computationally intensive and well-studied problem in autonomous robots. However, motion planning hardware accelerators (MPA) must be soft-error resilient for deployment in safety-critical applications, and blanket application of traditional mitigation techniques is ill-suited due to cost, power, and performance overheads. We propose Collision Exposure Factor (CEF), a novel metric to assess the failure vulnerability of circuits processing spatial relationships, including motion planning. CEF is based on the insight that the safety violation probability increases with the surface area of the physical space exposed by a bit-flip. We evaluate CEF on four MPAs. We demonstrate empirically that CEF is correlated with safety violation probability, and that CEF-aware selective error mitigation provides 12.3 ×, 9.6 ×, and 4.2 × lower dangerous Failures-In-Time rate on average for the same amount of protected memory compared to uniform, bit-position, and access-frequency-aware selection of critical data. Furthermore, we show how to employ CEF to enable fault characterization using 23, 000 × fewer fault injection (FI) experiments than exhaustive FI, and evaluate our FI approach on different robots and MPAs. We demonstrate that CEF-aware FI can provide insights on vulnerable bits in an MPA while taking the same amount of time as uniform statistical FI. Finally, we use the CEF to formulate guidelines for designing soft-error resilient MPAs.
自主机器人加速器对记忆错误的表征与改进
在自主机器人中,运动规划是一个计算量大且研究深入的问题。然而,运动规划硬件加速器(MPA)必须具有软错误弹性,才能在安全关键应用中部署,而由于成本、功率和性能开销,传统缓解技术的一揽子应用并不适合。我们提出了碰撞暴露因子(CEF),这是一种评估电路处理空间关系(包括运动规划)的失效脆弱性的新度量。CEF是基于这样一种认识,即安全违规概率随着比特翻转所暴露的物理空间表面积的增加而增加。我们评估了四个海洋保护区的CEF。我们从经验上证明了CEF与安全违反概率相关,并且与统一、位位置和访问频率感知的关键数据选择相比,对于相同数量的受保护内存,CEF感知的选择性错误缓解平均降低了12.3倍、9.6倍和4.2倍的危险及时故障率。此外,我们展示了如何使用CEF来实现故障表征,使用的故障注入(FI)实验比穷举FI少23000倍,并在不同的机器人和MPAs上评估了我们的FI方法。我们证明,cef感知的FI可以提供MPA中脆弱钻头的见解,同时花费与统一统计FI相同的时间。最后,我们使用CEF来制定设计软误差弹性mpa的指导方针。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
ACM Transactions on Cyber-Physical Systems
ACM Transactions on Cyber-Physical Systems COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS-
CiteScore
5.70
自引率
4.30%
发文量
40
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信