Ditto Processor

S. Lai, Shih-Lien Lu, J. Peir
DOI: 10.1109/DSN.2002.1028947 (https://doi.org/10.1109/DSN.2002.1028947)
Published in: Proceedings. International Conference on Dependable Systems and Networks, pp. 525-534
Publication date: 2002-06-23
Citations: 9

Abstract

Design effort for current single-chip commercial-off-the-shelf (COTS) microprocessors has concentrated on performance. Reliability has not been the primary focus. As supply voltage scales to accommodate technology scaling and to lower power consumption, transient errors are more likely to be introduced. The basic idea behind any error tolerance scheme involves some type of redundancy. Redundancy techniques fall into three general categories: (1) hardware redundancy, (2) information redundancy, and (3) time redundancy. Existing time-redundant techniques for improving the reliability of a superscalar processor use the otherwise unused hardware resources as much as possible to hide the overhead of program re-execution and verification. However, our study reveals that re-execution of long-latency operations contributes to performance loss. We suggest a method that handles short- and long-latency instructions in slightly different ways to reduce the performance degradation. Our goal is to minimize the hardware overhead and performance degradation while maximizing the fault detection coverage. Experimental studies through microarchitecture simulation compare the performance lost due to the proposed scheme with a non-fault-tolerant design and with different existing time-redundant fault-tolerant schemes. Fourteen integer and floating-point benchmarks are simulated, showing a 1.8% to 13.3% performance loss compared with a non-fault-tolerant superscalar processor.
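The time-redundancy idea mentioned in the abstract (run the same work twice on the same hardware and compare the results) can be illustrated with a minimal software sketch. This is a generic illustration only, not the Ditto processor's mechanism: the abstract does not describe how short- and long-latency instructions are treated, so nothing here models that. The names `run_with_time_redundancy`, `TransientFaultDetected`, and `make_flaky_adder` are hypothetical and introduced purely for the example.

```python
from typing import Callable


class TransientFaultDetected(Exception):
    """Raised when two executions of the same operation disagree."""


def run_with_time_redundancy(op: Callable[[int, int], int], a: int, b: int) -> int:
    """Execute `op` twice with the same inputs and compare the results.

    Generic time redundancy: a mismatch signals that one of the two
    executions was corrupted (hypothetical helper, not from the paper).
    """
    first = op(a, b)    # original execution
    second = op(a, b)   # redundant re-execution, later in time
    if first != second:
        raise TransientFaultDetected(f"results differ: {first} != {second}")
    return first


def make_flaky_adder(fault_on_call: int) -> Callable[[int, int], int]:
    """Build an adder that flips the lowest result bit on one chosen call.

    This stands in for a transient error; it is an assumption made for
    the demonstration, not a model of real hardware faults.
    """
    calls = 0

    def add(a: int, b: int) -> int:
        nonlocal calls
        calls += 1
        result = a + b
        if calls == fault_on_call:
            result ^= 1  # simulate a single-bit transient error
        return result

    return add
```

With a fault injected on the second call, the first redundant pair disagrees and the error is detected; a later pair with no fault agrees and returns the sum. The cost is visible in the sketch as well: every operation runs twice, which is the re-execution overhead the abstract says existing schemes try to hide in otherwise unused hardware resources.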