SafeDM: a Hardware Diversity Monitor for Redundant Execution on Non-Lockstepped Cores

F. Bas, Pedro Benedicte, S. Alcaide, Guillem Cabo, Fabio Mazzocchetti, J. Abella
Published in: 2022 Design, Automation & Test in Europe Conference & Exhibition (DATE)
Publication date: 2022-03-14
DOI: 10.23919/DATE54114.2022.9774540 (https://doi.org/10.23919/DATE54114.2022.9774540)
Citations: 4

Abstract

Computing systems in safety-critical domains, such as avionics or space, require safety measures commensurate with the criticality of the deployment. One problem these systems face is transient hardware failures. A common countermeasure is redundancy, for example two cores that execute the same program at the same time. However, redundancy does not cover every failure mode. In particular, it does not cover Common Cause Failures (CCF), where a single fault (e.g. a voltage droop) affects both cores identically. If both redundant cores hold identical state when the fault occurs, the fault can affect them in the same way, so both may produce the same erroneous result and the error may escape comparison. To avoid CCF, it is therefore critical to know that execution is diverse across the redundant cores. In this paper we introduce SafeDM, a hardware Diversity Monitor that quantifies the diversity of the redundant processors to guarantee that CCF will not go unnoticed, without needing to deploy lockstepped cores. SafeDM computes data diversity and instruction diversity separately, using a technique appropriate for each. We integrate SafeDM in a RISC-V FPGA space MPSoC from Cobham Gaisler, where it proves effective across a large benchmark suite while incurring low area and power overheads. Overall, SafeDM is an effective hardware solution to quantify diversity in cores performing redundant execution.
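The idea of monitoring diversity between redundant cores can be illustrated with a small software model. This is only a sketch under stated assumptions and not SafeDM's actual hardware mechanism, which the abstract does not detail: `CoreSnapshot`, `has_diversity`, `make_trace` and the one-instruction-per-cycle trace format are hypothetical names and simplifications introduced here. The model treats two cores as lacking diversity in a cycle when both their instruction state and their data state are identical, and checks the two kinds of state separately.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class CoreSnapshot:
    """Hypothetical per-cycle view of one core (illustrative only)."""
    instructions: Tuple[int, ...]  # encodings of the instructions in flight
    data: Tuple[int, ...]          # values observed on the monitored data paths


def has_diversity(a: CoreSnapshot, b: CoreSnapshot) -> bool:
    """Cores are diverse in a cycle if instruction OR data state differs."""
    instruction_diverse = a.instructions != b.instructions
    data_diverse = a.data != b.data
    return instruction_diverse or data_diverse


def count_no_diversity_cycles(trace_a: Sequence[CoreSnapshot],
                              trace_b: Sequence[CoreSnapshot]) -> int:
    """Count cycles in which both cores hold identical monitored state."""
    return sum(1 for a, b in zip(trace_a, trace_b) if not has_diversity(a, b))


def make_trace(program: Sequence[Tuple[int, int]], delay: int,
               cycles: int) -> List[CoreSnapshot]:
    """Toy core: one (instruction, data) pair per cycle, started after `delay`.

    Cycles outside the program are modelled as idle (all-zero state).
    """
    trace = []
    for cycle in range(cycles):
        index = cycle - delay
        if 0 <= index < len(program):
            instruction, value = program[index]
        else:
            instruction, value = 0, 0  # idle cycle
        trace.append(CoreSnapshot((instruction,), (value,)))
    return trace


# Assumed toy program: (instruction encoding, data value) per cycle.
program = [(0x13, 1), (0x93, 2), (0x33, 3), (0x63, 4)]

# Both cores start together: identical state in every cycle.
lockstep = count_no_diversity_cycles(make_trace(program, 0, 6),
                                     make_trace(program, 0, 6))

# Second core starts one cycle later: state differs while either core runs.
staggered = count_no_diversity_cycles(make_trace(program, 0, 6),
                                      make_trace(program, 1, 6))

print(lockstep, staggered)
```

In this toy setup the cores that start together never show diversity, while a one-cycle stagger leaves only the final cycle, in which both cores are idle, without diversity. A monitor of this kind reports when diversity is missing; it does not create diversity itself.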