Assessing the impact of hard faults in performance components of modern microprocessors

N. Foutris, D. Gizopoulos, J. Kalamatianos, Vilas Sridharan
{"title":"Assessing the impact of hard faults in performance components of modern microprocessors","authors":"N. Foutris, D. Gizopoulos, J. Kalamatianos, Vilas Sridharan","doi":"10.1109/ICCD.2013.6657044","DOIUrl":null,"url":null,"abstract":"A growing portion of the silicon area of modern high-performance microprocessors is dedicated to components that increase performance but do not determine functional correctness. Permanent hardware faults in these components can lead to performance fluctuation (not necessarily degradation) and do not produce functional errors. Although this fact has been identified previously, extensive research has not yet been conducted to accurately classify and quantify permanent faults in these components over a set of CPU benchmarks or measure the magnitude of the performance impact. Depending on the results of such studies, performance-related components of microprocessors can be disabled in fine or coarse granularities, salvaging microprocessor functionality at different performance levels. This paper analyzes the impact of permanent faults in the arrays and control logic of key microprocessor performance components such as the branch predictor, branch target buffer, return address stack, and data and instruction prefetchers. We apply a statistically safe fault injection campaign for single faults in performance components on a modified version of the cycle-accurate x86 architectural simulator PTLsim running the SPEC CPU2006 suite. Our evaluation reveals significant differences in the effect of faults and their performance impacts across the components as well as within each component (different fields). We classify faults for all components and analyze their IPC impact in the arrays and control logic. Our analysis shows that a very large fraction (44% to 96%) of permanent faults in these components leads only to performance fluctuation. Observation confirms the intuition that there are no functionality errors; however, many cases of a single fault in a performance component can significantly degrade microprocessor performance (2-20%average IPC reduction for SPEC CPU2006).","PeriodicalId":398811,"journal":{"name":"2013 IEEE 31st International Conference on Computer Design (ICCD)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"18","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 IEEE 31st International Conference on Computer Design (ICCD)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCD.2013.6657044","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 18

Abstract

A growing portion of the silicon area of modern high-performance microprocessors is dedicated to components that increase performance but do not determine functional correctness. Permanent hardware faults in these components can lead to performance fluctuation (not necessarily degradation) and do not produce functional errors. Although this fact has been identified previously, extensive research has not yet been conducted to accurately classify and quantify permanent faults in these components over a set of CPU benchmarks or measure the magnitude of the performance impact. Depending on the results of such studies, performance-related components of microprocessors can be disabled in fine or coarse granularities, salvaging microprocessor functionality at different performance levels. This paper analyzes the impact of permanent faults in the arrays and control logic of key microprocessor performance components such as the branch predictor, branch target buffer, return address stack, and data and instruction prefetchers. We apply a statistically safe fault injection campaign for single faults in performance components on a modified version of the cycle-accurate x86 architectural simulator PTLsim running the SPEC CPU2006 suite. Our evaluation reveals significant differences in the effect of faults and their performance impacts across the components as well as within each component (different fields). We classify faults for all components and analyze their IPC impact in the arrays and control logic. Our analysis shows that a very large fraction (44% to 96%) of permanent faults in these components leads only to performance fluctuation. Observation confirms the intuition that there are no functionality errors; however, many cases of a single fault in a performance component can significantly degrade microprocessor performance (2-20%average IPC reduction for SPEC CPU2006).
评估硬故障对现代微处理器性能组件的影响
现代高性能微处理器中越来越多的硅片用于提高性能但不确定功能正确性的组件。这些组件的永久性硬件故障可能导致性能波动(不一定会降低),但不会产生功能错误。尽管之前已经发现了这一事实,但是还没有进行广泛的研究来准确地分类和量化这些组件在一组CPU基准测试中的永久故障,或者衡量性能影响的程度。根据这些研究的结果,微处理器的性能相关组件可以在细粒度或粗粒度上禁用,从而在不同的性能水平上挽救微处理器的功能。分析了支路预测器、支路目标缓冲区、返回地址堆栈、数据和指令预取器等微处理器关键性能部件的阵列和控制逻辑中永久性故障的影响。我们在运行SPEC CPU2006套件的周期精确x86架构模拟器PTLsim的修改版本上应用统计安全故障注入活动来处理性能组件中的单个故障。我们的评估揭示了故障的影响及其对组件之间以及每个组件(不同领域)的性能影响的显着差异。我们对所有组件的故障进行分类,并分析其在阵列和控制逻辑中的IPC影响。我们的分析表明,这些部件中有很大一部分(44%至96%)的永久性故障只会导致性能波动。观察证实了没有功能性错误的直觉;然而,在许多情况下,单个故障的性能组件可以显著降低微处理器的性能(2-20%的IPC平均降低SPEC CPU2006)。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信