Accelerated microarchitectural Fault Injection-based reliability assessment

2015 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFTS). Pub Date: 2015-11-09. DOI: 10.1109/DFT.2015.7315134

Manolis Kaliorakis, Sotiris Tselonis, Athanasios Chatzidimitriou, D. Gizopoulos


Citation count: 14

Abstract

Statistical fault injection on microarchitectural simulators can provide early and accurate reliability characterization for array-based hardware components. Moreover, microarchitectural fault injectors are easily configurable (facilitating many reliability studies) and orders of magnitude faster than RTL fault injectors, which makes them appropriate tools for early reliability estimation using large and realistic benchmarks. However, the throughput of fault injection campaigns on microarchitectural simulators remains a bottleneck when a batch of campaigns must run for early reliability estimation of a processor (different microarchitectural characteristics, different workloads). This paper presents two operation modes on top of a baseline framework for statistical fault injection campaigns, trading off accuracy against speedup of the injection campaigns, with a state-of-the-art out-of-order full-system x86-64 simulator as the experimental vehicle. In the first mode, an injection experiment is stopped and classified as masked under either of the following conditions: (i) the fault is overwritten after the injection without having been read first, or (ii) the fault is injected into an invalid entry. The second mode has the same termination conditions as the first, but an injection experiment can also be terminated when an instruction that has read the faulty entry passes through the commit stage of the x86-64 out-of-order architecture. In the first mode, we observed a speedup of up to 2.92× with no loss of accuracy in the vulnerability measurements for all structures. In the second mode, an even higher speedup of up to 4.06× was obtained with a small loss in the accuracy of the vulnerability measurements.
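The early-termination conditions of the two modes can be illustrated with a minimal sketch. This is not the authors' framework: the event model, the names `Event`, `Outcome`, and `classify`, and the `STOPPED_AT_COMMIT` label are all hypothetical, introduced only to make the conditions concrete. The abstract does not state how an experiment stopped at commit is classified, so the sketch only reports that it stopped.

```python
from enum import Enum
from typing import Iterable


class Event(Enum):
    """Hypothetical post-injection events observed on the faulty entry."""
    READ = "read"      # an instruction reads the faulty entry
    WRITE = "write"    # the faulty entry is overwritten
    COMMIT = "commit"  # an instruction that read the faulty entry commits


class Outcome(Enum):
    MASKED = "masked"                        # stopped early, classified as masked
    STOPPED_AT_COMMIT = "stopped_at_commit"  # mode 2 only; final class not given in the abstract
    RUN_TO_END = "run_to_end"                # no early-termination condition fired


def classify(entry_valid: bool, events: Iterable[Event], mode: int) -> Outcome:
    """Apply the early-termination conditions of mode 1 or mode 2 (assumed event model)."""
    if mode not in (1, 2):
        raise ValueError("mode must be 1 or 2")

    # Condition (ii): the fault was injected into an invalid entry.
    if not entry_valid:
        return Outcome.MASKED

    was_read = False
    for event in events:
        if event is Event.READ:
            was_read = True
        elif event is Event.WRITE and not was_read:
            # Condition (i): overwritten before ever being read.
            return Outcome.MASKED
        elif event is Event.COMMIT and was_read and mode == 2:
            # Extra mode-2 condition: a reader of the faulty entry commits.
            return Outcome.STOPPED_AT_COMMIT

    # Otherwise the experiment runs to the end of the simulation.
    return Outcome.RUN_TO_END
```

For example, `classify(True, [Event.READ, Event.COMMIT], mode=2)` stops at commit, whereas the same event sequence in mode 1 runs to the end. That difference is where the additional speedup of the second mode, and its small accuracy loss, would come from.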
