CAMEO: A Two-Level Memory Organization with Capacity of Main Memory and Flexibility of Hardware-Managed Cache

Chiachen Chou, A. Jaleel, Moinuddin K. Qureshi
{"title":"CAMEO: A Two-Level Memory Organization with Capacity of Main Memory and Flexibility of Hardware-Managed Cache","authors":"Chiachen Chou, A. Jaleel, Moinuddin K. Qureshi","doi":"10.1109/MICRO.2014.63","DOIUrl":null,"url":null,"abstract":"This paper analyzes the trade-offs in architecting stacked DRAM either as part of main memory or as a hardware-managed cache. Using stacked DRAM as part of main memory increases the effective capacity, but obtaining high performance from such a system requires Operating System (OS) support to migrate data at a page-granularity. Using stacked DRAM as a hardware cache has the advantages of being transparent to the OS and perform data management at a line-granularity but suffers from reduced main memory capacity. This is because the stacked DRAM cache is not part of the memory address space. Ideally, we want the stacked DRAM to contribute towards capacity of main memory, and still maintain the hardware-based fine-granularity of a cache. We propose CAMEO, a hardware-based Cache-like Memory Organization that not only makes stacked DRAM visible as part of the memory address space but also exploits data locality on a fine-grained basis. CAMEO retains recently accessed data lines in stacked DRAM and swaps out the victim line to off chip memory. Since CAMEO can change the physical location of a line dynamically, we propose a low overhead Line Location Table (LLT) that tracks the physical location of all data lines. We also propose an accurate Line Location Predictor (LLP) to avoid the serialization of the LLT look-up and memory access. We evaluate a system that has 4GB stacked memory and 12GB off-chip memory. Using stacked DRAM as a cache improves performance by 50%, using as part of main memory improves performance by 33%, whereas CAMEO improves performance by 78%. Our proposed design is very close to an idealized memory system that uses the 4GB stacked DRAM as a hardware-managed cache and also increases the main memory capacity by an additional 4GB.","PeriodicalId":6591,"journal":{"name":"2014 47th Annual IEEE/ACM International Symposium on Microarchitecture","volume":"64 1","pages":"1-12"},"PeriodicalIF":0.0000,"publicationDate":"2014-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"144","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 47th Annual IEEE/ACM International Symposium on Microarchitecture","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MICRO.2014.63","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 144

Abstract

This paper analyzes the trade-offs in architecting stacked DRAM either as part of main memory or as a hardware-managed cache. Using stacked DRAM as part of main memory increases the effective capacity, but obtaining high performance from such a system requires Operating System (OS) support to migrate data at a page granularity. Using stacked DRAM as a hardware cache has the advantages of being transparent to the OS and performing data management at a line granularity, but it suffers from reduced main memory capacity because the stacked DRAM cache is not part of the memory address space. Ideally, we want the stacked DRAM to contribute to main memory capacity while still maintaining the hardware-based fine granularity of a cache. We propose CAMEO, a hardware-based Cache-like Memory Organization that not only makes stacked DRAM visible as part of the memory address space but also exploits data locality at a fine granularity. CAMEO retains recently accessed data lines in stacked DRAM and swaps the victim line out to off-chip memory. Since CAMEO can change the physical location of a line dynamically, we propose a low-overhead Line Location Table (LLT) that tracks the physical location of all data lines. We also propose an accurate Line Location Predictor (LLP) to avoid serializing the LLT lookup and the memory access. We evaluate a system with 4GB of stacked memory and 12GB of off-chip memory. Using stacked DRAM as a cache improves performance by 50%, using it as part of main memory improves performance by 33%, whereas CAMEO improves performance by 78%. Our proposed design comes very close to an idealized memory system that uses the 4GB of stacked DRAM as a hardware-managed cache while also increasing main memory capacity by an additional 4GB.
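To make the abstract's core mechanism concrete, the toy model below sketches LLT-indexed line swapping under stated assumptions: it is our illustration, not the paper's implementation. The class and method names (LineLocationTable, locate, access) are invented for this sketch, each stacked-DRAM line slot is assumed to form a congruence group with the off-chip lines sharing its index, and the 1:3 stacked-to-off-chip ratio simply mirrors the 4GB/12GB configuration the paper evaluates.

```python
"""Toy sketch of a CAMEO-style Line Location Table (LLT).

Assumption: each congruence group has 4 line slots -- slot 0 in stacked
DRAM, slots 1-3 in off-chip memory (mirroring the 4GB + 12GB setup).
The LLT records which of the group's 4 lines currently occupies each
slot; an access to a line that is off-chip swaps it into stacked DRAM
and moves the previous occupant (the victim) out. All identifiers here
are illustrative, not taken from the paper.
"""

GROUP_WAYS = 4  # 1 stacked-DRAM slot + 3 off-chip slots per group


class LineLocationTable:
    def __init__(self, num_groups):
        # llt[g][w] = which of group g's lines occupies physical slot w.
        # Initially line i sits in slot i (identity mapping).
        self.llt = [list(range(GROUP_WAYS)) for _ in range(num_groups)]

    def locate(self, group, line):
        """Return the physical slot (0 = stacked DRAM) holding `line`."""
        return self.llt[group].index(line)

    def access(self, group, line):
        """Serve an access; swap the line into stacked DRAM if off-chip.

        Returns True if the line was already in stacked DRAM.
        """
        slot = self.locate(group, line)
        if slot == 0:
            return True  # already resident in stacked DRAM
        # Swap: requested line moves into the stacked-DRAM slot,
        # the victim line moves out to the requester's old slot.
        entries = self.llt[group]
        entries[0], entries[slot] = entries[slot], entries[0]
        return False


if __name__ == "__main__":
    llt = LineLocationTable(num_groups=1)
    for line in [2, 2, 1, 2]:
        hit = llt.access(group=0, line=line)
        print(f"line {line}: "
              f"{'stacked-DRAM hit' if hit else 'swapped in from off-chip'}")
```

Note that in the real design the LLT is itself stored in memory, so consulting it before every access would serialize two memory operations; per the abstract, the Line Location Predictor (LLP) predicts a line's location so the memory access need not wait for the LLT lookup.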