Manish Gupta, D. Roberts, Mitesh R. Meswani, Vilas Sridharan, D. Tullsen, Rajesh K. Gupta
Reliability and Performance Trade-off Study of Heterogeneous Memories
Proceedings of the Second International Symposium on Memory Systems, 2016-10-03
DOI: 10.1145/2989081.2989113 (https://doi.org/10.1145/2989081.2989113)
Citation count: 4
Abstract
Heterogeneous memories, organized as die-stacked in-package memory and off-package memory, have attracted the attention of computer architects as a way to improve memory bandwidth and capacity. Researchers have explored methods and organizations that optimize performance by increasing the access rate to the faster die-stacked memory. Unfortunately, the reliability of such arrangements has not been studied carefully, which makes them less attractive for data centers and mission-critical systems. Field studies show that memory reliability depends on device physics as well as on error correction codes (ECC). Because of the capacity, latency, and energy costs of ECC, performance-critical in-package memories may favor weaker ECC solutions than off-chip memory. Moreover, these systems are tuned for peak performance by increasing the access rate to the high-performance in-package memory. In this paper, the authors use real-world DRAM failure data to conduct a trade-off study of the reliability and performance of Heterogeneous Memory Architectures (HMA). The paper shows that an HMA system optimized only for performance may suffer from impaired reliability over time. It also proposes an age-aware access rate control algorithm to ensure reliable operation of long-running systems.