Evaluation of Non-Volatile Memory Based Last Level Cache Given Modern Use Case Behavior

2019 IEEE International Symposium on Workload Characterization (IISWC). Pub Date: 2019-11-01. DOI: 10.1109/IISWC47752.2019.9042051

Alexander Hankin, Tomer Shapira, K. Sangaiah, Michael Lui, Mark Hempstead


Citations: 7

Abstract

To confront the memory wall and keep up with the demands of changing use cases, Non-Volatile Memories (NVMs) have begun to be considered as a replacement for SRAM in the Last Level Cache (LLC). Recent work has shown that the small cell size of NVMs like Spin-Torque Transfer RAM (STTRAM) and Resistive RAM (RRAM) allows designers to build significantly denser LLCs than those with SRAM-based cells. In some cases, this allows for storing up to 10× more data on-chip than before. As the working set size of use cases increases with the advent of statistical inference (e.g., machine learning (ML) and artificial intelligence (AI)), more capacity close to the processor is necessary to keep up with the demand for performance and low power. Despite the growing potential of NVM-based LLCs, there are still fundamental problems that need to be addressed. First, the research community lacks a methodology for consistently modeling these devices, which leads to apples-to-oranges comparisons across NVM-based LLCs. Second, NVMs exhibit a key operational difference from SRAM: read and write asymmetry. The effects of this asymmetry on use case performance and power are mostly unknown, with prior art relying only on total read and write counts and on limited sets of use cases. In this work we present two novel contributions: (1) a set of heuristics for modeling emerging NVM-based LLCs, and (2) a workload characterization framework that learns how architecture-agnostic features, like entropy and working set size, affect the performance and power of an NVM-based LLC system for different use cases. In addition, with this work we release our NVM cell models and make them publicly available online. Using our NVM-based LLC models, we show that NVM-based LLC energy use is up to an order of magnitude less than that of an SRAM-based LLC, while ED²P is generally on par. From our workload characterization framework, we show that for the AI use cases, energy and speedup are 99% correlated with write entropy, 90% write footprint, and unique write footprint, while negligibly correlated with total read and write footprint.
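The abstract names several architecture-agnostic features without defining them. The following is a minimal illustrative sketch, not the authors' framework: it assumes write entropy means the Shannon entropy of the write-address distribution, unique write footprint means the number of distinct addresses written, and 90% write footprint means the smallest number of distinct addresses that accounts for 90% of all writes. The function names and the toy trace are hypothetical. ED²P is taken in its usual sense of energy multiplied by delay squared.

```python
import math
from collections import Counter


def write_entropy(addresses):
    """Shannon entropy (in bits) of the write-address distribution (assumed definition)."""
    counts = Counter(addresses)
    total = len(addresses)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def unique_footprint(addresses):
    """Number of distinct addresses touched (assumed definition)."""
    return len(set(addresses))


def footprint_at(addresses, fraction=0.9):
    """Smallest number of distinct addresses covering `fraction` of accesses (assumed definition)."""
    counts = sorted(Counter(addresses).values(), reverse=True)
    total = len(addresses)
    covered = 0
    for n, c in enumerate(counts, start=1):
        covered += c
        if covered >= fraction * total:
            return n
    return 0  # empty trace


def ed2p(energy, delay):
    """Energy-delay-squared product: energy * delay**2."""
    return energy * delay ** 2


# Hypothetical write trace: one hot address, one warm address, one cold address.
trace = [0x10] * 15 + [0x20] * 4 + [0x30]
features = {
    "write_entropy": write_entropy(trace),
    "unique_write_footprint": unique_footprint(trace),
    "write_footprint_90": footprint_at(trace, 0.9),
}
```

Under these assumed definitions, a trace dominated by a few hot addresses has low write entropy and a small 90% write footprint even when its total write count is large, which is one way to read the abstract's point that total counts alone do not capture write behavior.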


