{"title":"技术规模对寿命可靠性的影响","authors":"Jayanth Srinivasan, S. Adve, P. Bose, J. Rivers","doi":"10.1109/DSN.2004.1311888","DOIUrl":null,"url":null,"abstract":"The relentless scaling of CMOS technology has provided a steady increase in processor performance for the past three decades. However, increased power densities (hence temperatures) and other scaling effects have an adverse impact on long-term processor lifetime reliability. This paper represents a first attempt at quantifying the impact of scaling on lifetime reliability due to intrinsic hard errors, taking workload characteristics into consideration. For our quantitative evaluation, we use RAMP (Srinivasan et al., 2004), a previously proposed industrial-strength model that provides reliability estimates for a workload, but for a given technology. We extend RAMP by adding scaling specific parameters to enable workload-dependent lifetime reliability evaluation at different technologies. We show that (1) scaling has a significant impact on processor hard failure rates - on average, with SPEC benchmarks, we find the failure rate of a scaled 65nm processor to be 316% higher than a similarly pipelined 180nm processor; (2) time-dependent dielectric breakdown and electromigration have the largest increases; and (3) with scaling, the difference in reliability from running at worst-case vs. typical workload operating conditions increases significantly, as does the difference from running different workloads. Our results imply that leveraging a single microarchitecture design for multiple remaps across a few technology generations will become increasingly difficult, and motivate a need for workload specific, microarchitectural lifetime reliability awareness at an early design stage.","PeriodicalId":436323,"journal":{"name":"International Conference on Dependable Systems and Networks, 2004","volume":"125 5 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2004-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"611","resultStr":"{\"title\":\"The impact of technology scaling on lifetime reliability\",\"authors\":\"Jayanth Srinivasan, S. Adve, P. Bose, J. Rivers\",\"doi\":\"10.1109/DSN.2004.1311888\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The relentless scaling of CMOS technology has provided a steady increase in processor performance for the past three decades. However, increased power densities (hence temperatures) and other scaling effects have an adverse impact on long-term processor lifetime reliability. This paper represents a first attempt at quantifying the impact of scaling on lifetime reliability due to intrinsic hard errors, taking workload characteristics into consideration. For our quantitative evaluation, we use RAMP (Srinivasan et al., 2004), a previously proposed industrial-strength model that provides reliability estimates for a workload, but for a given technology. We extend RAMP by adding scaling specific parameters to enable workload-dependent lifetime reliability evaluation at different technologies. We show that (1) scaling has a significant impact on processor hard failure rates - on average, with SPEC benchmarks, we find the failure rate of a scaled 65nm processor to be 316% higher than a similarly pipelined 180nm processor; (2) time-dependent dielectric breakdown and electromigration have the largest increases; and (3) with scaling, the difference in reliability from running at worst-case vs. typical workload operating conditions increases significantly, as does the difference from running different workloads. Our results imply that leveraging a single microarchitecture design for multiple remaps across a few technology generations will become increasingly difficult, and motivate a need for workload specific, microarchitectural lifetime reliability awareness at an early design stage.\",\"PeriodicalId\":436323,\"journal\":{\"name\":\"International Conference on Dependable Systems and Networks, 2004\",\"volume\":\"125 5 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2004-06-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"611\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Conference on Dependable Systems and Networks, 2004\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/DSN.2004.1311888\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Dependable Systems and Networks, 2004","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DSN.2004.1311888","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 611
摘要
在过去的三十年里,CMOS技术的不断扩展为处理器性能提供了稳定的增长。然而,增加的功率密度(因此温度)和其他缩放效应对处理器的长期寿命可靠性有不利影响。考虑到工作负载特征,本文首次尝试量化由于固有硬错误而导致的扩展对寿命可靠性的影响。对于我们的定量评估,我们使用RAMP (Srinivasan et al., 2004),这是先前提出的工业强度模型,提供了工作负载的可靠性估计,但仅限于给定的技术。我们通过添加可缩放的特定参数来扩展RAMP,以支持在不同技术下进行与工作负载相关的寿命可靠性评估。我们发现(1)缩放对处理器硬故障率有显著影响——平均而言,在SPEC基准测试中,我们发现缩放后的65nm处理器的故障率比类似流水线的180nm处理器高316%;(2)随时间变化的介质击穿和电迁移增加幅度最大;(3)通过扩展,在最坏情况下与典型工作负载操作条件下运行的可靠性差异显着增加,运行不同工作负载的差异也显着增加。我们的研究结果表明,利用单个微架构设计实现跨几代技术的多个映射将变得越来越困难,并且在早期设计阶段就需要特定于工作负载的微架构生命周期可靠性意识。
The impact of technology scaling on lifetime reliability
The relentless scaling of CMOS technology has provided a steady increase in processor performance for the past three decades. However, increased power densities (hence temperatures) and other scaling effects have an adverse impact on long-term processor lifetime reliability. This paper represents a first attempt at quantifying the impact of scaling on lifetime reliability due to intrinsic hard errors, taking workload characteristics into consideration. For our quantitative evaluation, we use RAMP (Srinivasan et al., 2004), a previously proposed industrial-strength model that provides reliability estimates for a workload, but for a given technology. We extend RAMP by adding scaling specific parameters to enable workload-dependent lifetime reliability evaluation at different technologies. We show that (1) scaling has a significant impact on processor hard failure rates - on average, with SPEC benchmarks, we find the failure rate of a scaled 65nm processor to be 316% higher than a similarly pipelined 180nm processor; (2) time-dependent dielectric breakdown and electromigration have the largest increases; and (3) with scaling, the difference in reliability from running at worst-case vs. typical workload operating conditions increases significantly, as does the difference from running different workloads. Our results imply that leveraging a single microarchitecture design for multiple remaps across a few technology generations will become increasingly difficult, and motivate a need for workload specific, microarchitectural lifetime reliability awareness at an early design stage.