Power supply induced common cause faults - experimental assessment of potential countermeasures

Peter Tummeltshammer, A. Steininger
Published in: 2009 IEEE/IFIP International Conference on Dependable Systems & Networks
Publication date: 2009-09-29
DOI: 10.1109/DSN.2009.5270308 (https://doi.org/10.1109/DSN.2009.5270308)
Citations: 12

Abstract

Fault-tolerant architectures based on physical replication of components are vulnerable to faults that cause the same effect in all replicas. Short outages in a power supply shared by all replicas are a prominent example of such common cause faults. For systems in which providing a replicated power supply would require prohibitive effort, identifying reliable countermeasures against these effects is vital to maintaining the required dependability level. In this paper we propose several such countermeasures, namely parity protection, voltage monitoring, and time diversity of the replicas. We perform extensive fault injection experiments on three fault-tolerant dual-core processor designs, one FPGA-based and two commercial ASICs. These experiments provide evidence for the vulnerability of a completely unprotected dual-core solution, while time diversity and voltage monitoring, in combination with increased timing margins, turn out to be particularly effective at eliminating common cause effects.
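
The following toy model is not taken from the paper. It is a minimal sketch of why time diversity can expose a common cause fault that a lockstep comparison would miss. It assumes two replicas running the same computation, a single supply glitch that flips the same bit in whatever each replica is computing at that instant, and a comparator that checks results for the same input. The function names, the stand-in computation, and the bit-flip fault model are all hypothetical simplifications.

```python
def run_replica(inputs, glitch_cycle, delay):
    """Return the output stream of one replica in a toy fault model.

    Assumption: the replica processes inputs[i] at wall-clock cycle i + delay.
    A supply glitch at wall-clock cycle `glitch_cycle` flips the low bit of
    the result being computed at that moment (hypothetical fault effect).
    """
    outputs = []
    for i, x in enumerate(inputs):
        result = (x * 3 + 1) & 0xFF  # stand-in for the real computation
        if glitch_cycle is not None and i + delay == glitch_cycle:
            result ^= 0x01  # same corruption in every replica hit by the glitch
        outputs.append(result)
    return outputs


def comparator_detects(inputs, glitch_cycle, delay_b):
    """True if comparing replicas A and B per input reveals any mismatch."""
    out_a = run_replica(inputs, glitch_cycle, delay=0)
    out_b = run_replica(inputs, glitch_cycle, delay=delay_b)
    return any(a != b for a, b in zip(out_a, out_b))


inputs = list(range(8))

# Lockstep: both replicas are corrupted on the same input, so results agree.
lockstep_detected = comparator_detects(inputs, glitch_cycle=4, delay_b=0)

# Time diversity: replica B runs two cycles behind, so the glitch corrupts
# a different input in each replica and the comparator sees a mismatch.
diverse_detected = comparator_detects(inputs, glitch_cycle=4, delay_b=2)

print(lockstep_detected, diverse_detected)
```

In this sketch the lockstep pair produces identical (wrong) results and the fault goes unnoticed, while the delayed pair disagrees on at least one result. Real supply disturbances affect timing and state in far less regular ways, which is why the paper relies on fault injection experiments rather than a model of this kind.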