An Efficient Soft Error Detection in Multicore Processors Running Server Applications

A. Tajary, H. Zarandi
2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP), published 2016-02-01
DOI: 10.1109/PDP.2016.100 (https://doi.org/10.1109/PDP.2016.100)

Abstract

This paper presents a throughput-aware transient fault detection method tailored to the characteristics of server processors. The proposed method combines reconfigurable redundant-execution-based fault detection with speculative fault detection. In the reconfigurable redundant-execution-based method, a configuration manager module couples two free adjacent cores to execute a thread, and decouples them when resources for normal execution are limited. This exploits unused resources in multicore processors to provide reliable execution while keeping throughput high. The speculative fault detection method uses a history of the block addresses requested from the L1 cache to the L2 cache during thread execution to identify abnormal execution behavior. The proposed method is evaluated using the Alpha processor model in the Gem5 simulator. The experimental results show that 70% of injected faults can be detected with negligible hardware overhead.
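
The two mechanisms named in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' hardware design: the abstract does not specify the core topology, the decoupling policy, or how the address history is consulted. The class names `ConfigurationManager` and `BlockHistoryDetector`, the linear core layout, the choice of which pair to decouple, the 64-byte block size, and the rule "a block absent from the history is abnormal" are all hypothetical choices made for illustration.

```python
from collections import deque
from typing import Dict, List, Optional


class ConfigurationManager:
    """Toy model (assumption): cores sit in a row, so core i neighbours core i + 1."""

    def __init__(self, num_cores: int) -> None:
        self.owner: List[Optional[str]] = [None] * num_cores  # thread running on each core
        self.checker: Dict[str, int] = {}  # thread -> index of its redundant (checker) core

    def schedule(self, thread: str) -> str:
        # Prefer two free adjacent cores so the thread can execute redundantly.
        for i in range(len(self.owner) - 1):
            if self.owner[i] is None and self.owner[i + 1] is None:
                self.owner[i] = self.owner[i + 1] = thread
                self.checker[thread] = i + 1
                return "coupled"
        # Otherwise fall back to any single free core, without redundancy.
        for i, running in enumerate(self.owner):
            if running is None:
                self.owner[i] = thread
                return "single"
        # Resources are limited: decouple one pair and reuse its checker core.
        # Picking the oldest coupled thread is an arbitrary, hypothetical policy.
        if self.checker:
            victim, core = next(iter(self.checker.items()))
            del self.checker[victim]
            self.owner[core] = thread
            return "single-after-decouple"
        return "rejected"


class BlockHistoryDetector:
    """Toy model (assumption): flag L1-to-L2 requests for blocks not seen recently."""

    def __init__(self, capacity: int = 1024, block_bits: int = 6) -> None:
        self.block_bits = block_bits  # 2**6 = 64-byte blocks (assumed size)
        self.history = deque(maxlen=capacity)  # bounded history; oldest entries drop out

    def record(self, address: int) -> None:
        # Store the block address of a request considered fault-free.
        self.history.append(address >> self.block_bits)

    def is_abnormal(self, address: int) -> bool:
        # A request to a block outside the history is treated as suspicious.
        return (address >> self.block_bits) not in self.history


manager = ConfigurationManager(num_cores=4)
results = [manager.schedule(name) for name in ["A", "B", "C", "D", "E"]]

detector = BlockHistoryDetector()
for addr in (0x1000, 0x1040):
    detector.record(addr)
```

In this model, threads A and B each receive a coupled pair while cores are idle, C and D each take over a checker core once the chip is full, and E is rejected. A request that falls in an unseen block is only a hint of a fault, since legitimate first-time accesses look the same, which is presumably why the abstract calls this detection speculative.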