现代DRAM芯片中设计引起的延迟变化:表征、分析和延迟减少机制

Proceedings of the 2017 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems Pub Date : 2017-06-05 DOI:10.1145/3078505.3078533

Donghyuk Lee, S. Khan, Lavanya Subramanian, Saugata Ghose, Rachata Ausavarungnirun, Gennady Pekhimenko, V. Seshadri, O. Mutlu

{"title":"现代DRAM芯片中设计引起的延迟变化:表征、分析和延迟减少机制","authors":"Donghyuk Lee, S. Khan, Lavanya Subramanian, Saugata Ghose, Rachata Ausavarungnirun, Gennady Pekhimenko, V. Seshadri, O. Mutlu","doi":"10.1145/3078505.3078533","DOIUrl":null,"url":null,"abstract":"Variation has been shown to exist across the cells within a modern DRAM chip. Prior work has studied and exploited several forms of variation, such as manufacturing-process- or temperature-induced variation. We empirically demonstrate a new form of variation that exists within a real DRAM chip, induced by the design and placement of different components in the DRAM chip: different regions in DRAM, based on their relative distances from the peripheral structures, require different minimum access latencies for reliable operation. In particular, we show that in most real DRAM chips, cells closer to the peripheral structures can be accessed much faster than cells that are farther. We call this phenomenon design-induced variation in DRAM. Our goals are to i) understand design-induced variation that exists in real, state-of-the-art DRAM chips, ii) exploit it to develop low-cost mechanisms that can dynamically find and use the lowest latency at which to operate a DRAM chip reliably, and, thus, iii) improve overall system performance while ensuring reliable system operation. To this end, we first experimentally demonstrate and analyze designed-induced variation in modern DRAM devices by testing and characterizing 96 DIMMs (768 DRAM chips). Our experimental study shows that i) modern DRAM chips exhibit design-induced latency variation in both row and column directions, ii) access latency gradually increases in the row direction within a DRAM cell array (mat) and this pattern repeats in every mat, and iii) some columns require higher latency than others due to the internal hierarchical organization of the DRAM chip. Our characterization identifies DRAM regions that are vulnerable to errors, if operated at lower latency, and finds consistency in their locations across a given DRAM chip generation, due to design-induced variation. Variations in the vertical and horizontal dimensions, together, divide the cell array into heterogeneous-latency regions, where cells in some regions require longer access latencies for reliable operation. Reducing the latency uniformly across all regions in DRAM would improve performance, but can introduce failures in the inherently slower regions that require longer access latencies for correct operation. We refer to these inherently slower regions of DRAM as design-induced vulnerable regions. Based on our extensive experimental analysis, we develop two mechanisms that reliably reduce DRAM latency. First, DIVI Profiling uses runtime profiling to dynamically identify the lowest DRAM latency that does not introduce failures. DIVA Profiling exploits design-induced variation and periodically profiles only the vulnerable regions to determine the lowest DRAM latency at low cost. It is the first mechanism to dynamically determine the lowest latency that can be used to operate DRAM reliably. DIVA Profiling reduces the latency of read/write requests by 35.1%/57.8%, respectively, at 55C. Our second mechanism, DIVA Shuffling, shuffles data such that values stored in vulnerable regions are mapped to multiple error-correcting code (ECC) codewords. As a result, DIVA Shuffling can correct 26% more multi-bit errors than conventional ECC. Combined together, our two mechanisms reduce read/write latency by 40.0%/60.5%, which translates to an overall system performance improvement of 14.7%/13.7%/13.8% (in 2-/4-/8-core systems) over a variety of workloads, while ensuring reliable operation.","PeriodicalId":133673,"journal":{"name":"Proceedings of the 2017 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems","volume":"114 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"112","resultStr":"{\"title\":\"Design-Induced Latency Variation in Modern DRAM Chips: Characterization, Analysis, and Latency Reduction Mechanisms\",\"authors\":\"Donghyuk Lee, S. Khan, Lavanya Subramanian, Saugata Ghose, Rachata Ausavarungnirun, Gennady Pekhimenko, V. Seshadri, O. Mutlu\",\"doi\":\"10.1145/3078505.3078533\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Variation has been shown to exist across the cells within a modern DRAM chip. Prior work has studied and exploited several forms of variation, such as manufacturing-process- or temperature-induced variation. We empirically demonstrate a new form of variation that exists within a real DRAM chip, induced by the design and placement of different components in the DRAM chip: different regions in DRAM, based on their relative distances from the peripheral structures, require different minimum access latencies for reliable operation. In particular, we show that in most real DRAM chips, cells closer to the peripheral structures can be accessed much faster than cells that are farther. We call this phenomenon design-induced variation in DRAM. Our goals are to i) understand design-induced variation that exists in real, state-of-the-art DRAM chips, ii) exploit it to develop low-cost mechanisms that can dynamically find and use the lowest latency at which to operate a DRAM chip reliably, and, thus, iii) improve overall system performance while ensuring reliable system operation. To this end, we first experimentally demonstrate and analyze designed-induced variation in modern DRAM devices by testing and characterizing 96 DIMMs (768 DRAM chips). Our experimental study shows that i) modern DRAM chips exhibit design-induced latency variation in both row and column directions, ii) access latency gradually increases in the row direction within a DRAM cell array (mat) and this pattern repeats in every mat, and iii) some columns require higher latency than others due to the internal hierarchical organization of the DRAM chip. Our characterization identifies DRAM regions that are vulnerable to errors, if operated at lower latency, and finds consistency in their locations across a given DRAM chip generation, due to design-induced variation. Variations in the vertical and horizontal dimensions, together, divide the cell array into heterogeneous-latency regions, where cells in some regions require longer access latencies for reliable operation. Reducing the latency uniformly across all regions in DRAM would improve performance, but can introduce failures in the inherently slower regions that require longer access latencies for correct operation. We refer to these inherently slower regions of DRAM as design-induced vulnerable regions. Based on our extensive experimental analysis, we develop two mechanisms that reliably reduce DRAM latency. First, DIVI Profiling uses runtime profiling to dynamically identify the lowest DRAM latency that does not introduce failures. DIVA Profiling exploits design-induced variation and periodically profiles only the vulnerable regions to determine the lowest DRAM latency at low cost. It is the first mechanism to dynamically determine the lowest latency that can be used to operate DRAM reliably. DIVA Profiling reduces the latency of read/write requests by 35.1%/57.8%, respectively, at 55C. Our second mechanism, DIVA Shuffling, shuffles data such that values stored in vulnerable regions are mapped to multiple error-correcting code (ECC) codewords. As a result, DIVA Shuffling can correct 26% more multi-bit errors than conventional ECC. Combined together, our two mechanisms reduce read/write latency by 40.0%/60.5%, which translates to an overall system performance improvement of 14.7%/13.7%/13.8% (in 2-/4-/8-core systems) over a variety of workloads, while ensuring reliable operation.\",\"PeriodicalId\":133673,\"journal\":{\"name\":\"Proceedings of the 2017 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems\",\"volume\":\"114 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-06-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"112\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2017 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3078505.3078533\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2017 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3078505.3078533","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 112

摘要

变异已被证明存在于现代DRAM芯片的各个单元之间。先前的工作已经研究和利用了几种形式的变化，如制造过程或温度引起的变化。我们通过经验证明了一种新的变化形式，这种变化存在于真实的DRAM芯片中，由DRAM芯片中不同组件的设计和放置引起:DRAM中的不同区域，基于它们与外围结构的相对距离，需要不同的最小访问延迟以实现可靠的操作。特别是，我们表明，在大多数实际的DRAM芯片中，更接近外围结构的单元可以比更远的单元更快地被访问。我们把这种现象称为DRAM中的设计诱导变异。我们的目标是i)理解存在于真实的、最先进的DRAM芯片中的设计引起的变化，ii)利用它来开发低成本机制，可以动态地发现并使用最低的延迟来可靠地操作DRAM芯片，因此，iii)在确保可靠系统运行的同时提高整体系统性能。为此，我们首先通过测试和表征96个dimm(768个DRAM芯片)，实验证明和分析了现代DRAM器件中设计引起的变化。我们的实验研究表明，i)现代DRAM芯片在行和列方向上都表现出设计引起的延迟变化，ii)在DRAM单元阵列(mat)中，访问延迟在行方向上逐渐增加，并且这种模式在每个mat中重复，iii)由于DRAM芯片的内部分层组织，一些列需要比其他列更高的延迟。我们的特性确定了易受错误影响的DRAM区域，如果在较低的延迟下操作，并在给定的DRAM芯片生成中发现其位置的一致性，这是由于设计引起的变化。垂直和水平维度的变化一起将单元阵列划分为异构延迟区域，其中某些区域的单元需要更长的访问延迟才能进行可靠的操作。统一地减少DRAM中所有区域的延迟可以提高性能，但可能会在固有的较慢区域中引入故障，这些区域需要更长的访问延迟才能正确操作。我们将这些天生较慢的DRAM区域称为设计诱发的脆弱区域。基于我们广泛的实验分析，我们开发了两种可靠地减少DRAM延迟的机制。首先，DIVI分析使用运行时分析来动态识别不会引入故障的最低DRAM延迟。DIVA分析利用设计引起的变化，周期性地只分析易受攻击的区域，以低成本确定最低的DRAM延迟。它是第一个动态确定最低延迟的机制，可以用来可靠地操作DRAM。DIVA分析在55C时分别将读/写请求的延迟减少了35.1%/57.8%。我们的第二种机制，DIVA洗牌，洗牌数据，这样存储在脆弱区域的值被映射到多个纠错码(ECC)码字。因此，DIVA变换可以比传统的ECC多纠正26%的多比特错误。结合在一起，我们的两种机制将读/写延迟减少了40.0%/60.5%，这意味着在各种工作负载下，整体系统性能提高了14.7%/13.7%/13.8%(在2核/4核/8核系统中)，同时确保了可靠的操作。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Design-Induced Latency Variation in Modern DRAM Chips: Characterization, Analysis, and Latency Reduction Mechanisms

Variation has been shown to exist across the cells within a modern DRAM chip. Prior work has studied and exploited several forms of variation, such as manufacturing-process- or temperature-induced variation. We empirically demonstrate a new form of variation that exists within a real DRAM chip, induced by the design and placement of different components in the DRAM chip: different regions in DRAM, based on their relative distances from the peripheral structures, require different minimum access latencies for reliable operation. In particular, we show that in most real DRAM chips, cells closer to the peripheral structures can be accessed much faster than cells that are farther. We call this phenomenon design-induced variation in DRAM. Our goals are to i) understand design-induced variation that exists in real, state-of-the-art DRAM chips, ii) exploit it to develop low-cost mechanisms that can dynamically find and use the lowest latency at which to operate a DRAM chip reliably, and, thus, iii) improve overall system performance while ensuring reliable system operation. To this end, we first experimentally demonstrate and analyze designed-induced variation in modern DRAM devices by testing and characterizing 96 DIMMs (768 DRAM chips). Our experimental study shows that i) modern DRAM chips exhibit design-induced latency variation in both row and column directions, ii) access latency gradually increases in the row direction within a DRAM cell array (mat) and this pattern repeats in every mat, and iii) some columns require higher latency than others due to the internal hierarchical organization of the DRAM chip. Our characterization identifies DRAM regions that are vulnerable to errors, if operated at lower latency, and finds consistency in their locations across a given DRAM chip generation, due to design-induced variation. Variations in the vertical and horizontal dimensions, together, divide the cell array into heterogeneous-latency regions, where cells in some regions require longer access latencies for reliable operation. Reducing the latency uniformly across all regions in DRAM would improve performance, but can introduce failures in the inherently slower regions that require longer access latencies for correct operation. We refer to these inherently slower regions of DRAM as design-induced vulnerable regions. Based on our extensive experimental analysis, we develop two mechanisms that reliably reduce DRAM latency. First, DIVI Profiling uses runtime profiling to dynamically identify the lowest DRAM latency that does not introduce failures. DIVA Profiling exploits design-induced variation and periodically profiles only the vulnerable regions to determine the lowest DRAM latency at low cost. It is the first mechanism to dynamically determine the lowest latency that can be used to operate DRAM reliably. DIVA Profiling reduces the latency of read/write requests by 35.1%/57.8%, respectively, at 55C. Our second mechanism, DIVA Shuffling, shuffles data such that values stored in vulnerable regions are mapped to multiple error-correcting code (ECC) codewords. As a result, DIVA Shuffling can correct 26% more multi-bit errors than conventional ECC. Combined together, our two mechanisms reduce read/write latency by 40.0%/60.5%, which translates to an overall system performance improvement of 14.7%/13.7%/13.8% (in 2-/4-/8-core systems) over a variety of workloads, while ensuring reliable operation.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the 2017 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems

自引率

0.00%

发文量