System-level implications of disaggregated memory

Kevin T. Lim, Yoshio Turner, J. R. Santos, Alvin AuYoung, Jichuan Chang, Parthasarathy Ranganathan, T. Wenisch
{"title":"System-level implications of disaggregated memory","authors":"Kevin T. Lim, Yoshio Turner, J. R. Santos, Alvin AuYoung, Jichuan Chang, Parthasarathy Ranganathan, T. Wenisch","doi":"10.1109/HPCA.2012.6168955","DOIUrl":null,"url":null,"abstract":"Recent research on memory disaggregation introduces a new architectural building block — the memory blade — as a cost-effective approach for memory capacity expansion and sharing for an ensemble of blade servers. Memory blades augment blade servers' local memory capacity with a second-level (remote) memory that can be dynamically apportioned among blades in response to changing capacity demand, albeit at a higher access latency. In this paper, we build on the prior research to explore the software and systems implications of disaggregated memory. We develop a software-based prototype by extending the Xen hypervisor to emulate a disaggregated memory design wherein remote pages are swapped into local memory on-demand upon access. Our prototyping effort reveals that low-latency remote memory calls for a different regime of replacement policies than conventional disk paging, favoring minimal hypervisor overhead even at the cost of using less sophisticated replacement policies. Second, we demonstrate the synergy between disaggregated memory and content-based page sharing. By allowing content to be shared both within and across blades (in local and remote memory, respectively), we find that their combination provides greater workload consolidation opportunity and performance-per-dollar than either technique alone. Finally, we explore a realistic deployment scenario in which disaggregated memory is used to reduce the scaling cost of a memcached system. We show that disaggregated memory can provide a 50% improvement in performance-per-dollar relative to conventional scale-out.","PeriodicalId":380383,"journal":{"name":"IEEE International Symposium on High-Performance Comp Architecture","volume":"395 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"194","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE International Symposium on High-Performance Comp Architecture","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPCA.2012.6168955","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 194

Abstract

Recent research on memory disaggregation introduces a new architectural building block — the memory blade — as a cost-effective approach for memory capacity expansion and sharing for an ensemble of blade servers. Memory blades augment blade servers' local memory capacity with a second-level (remote) memory that can be dynamically apportioned among blades in response to changing capacity demand, albeit at a higher access latency. In this paper, we build on the prior research to explore the software and systems implications of disaggregated memory. We develop a software-based prototype by extending the Xen hypervisor to emulate a disaggregated memory design wherein remote pages are swapped into local memory on-demand upon access. Our prototyping effort reveals that low-latency remote memory calls for a different regime of replacement policies than conventional disk paging, favoring minimal hypervisor overhead even at the cost of using less sophisticated replacement policies. Second, we demonstrate the synergy between disaggregated memory and content-based page sharing. By allowing content to be shared both within and across blades (in local and remote memory, respectively), we find that their combination provides greater workload consolidation opportunity and performance-per-dollar than either technique alone. Finally, we explore a realistic deployment scenario in which disaggregated memory is used to reduce the scaling cost of a memcached system. We show that disaggregated memory can provide a 50% improvement in performance-per-dollar relative to conventional scale-out.
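The central mechanism the abstract describes is a demand-paging style of remote memory: on an access to a page held on the memory blade, the hypervisor swaps that page into local memory, evicting a resident page under a deliberately simple replacement policy so that bookkeeping stays off the critical path. The sketch below is only an illustration of that idea under stated assumptions, not the authors' Xen prototype: it is a user-space toy, and the names (local_frame, access_page, LOCAL_FRAMES) and the choice of FIFO eviction are hypothetical.

/* Toy user-space simulation of on-demand remote-page swapping with a
 * simple FIFO replacement policy. Illustrative only; not taken from
 * the paper's Xen-based prototype. */
#include <stdio.h>
#include <stdbool.h>

#define LOCAL_FRAMES 4           /* local (fast) memory capacity in pages */
#define REMOTE_PAGES 16          /* pages resident on the memory blade    */

static int  local_frame[LOCAL_FRAMES];  /* remote page held by each frame, -1 = empty */
static int  fifo_next = 0;              /* next frame to evict (round-robin = FIFO)   */
static long local_hits, remote_fetches;

/* Return true if the page is already resident in local memory. */
static bool in_local(int page)
{
    for (int i = 0; i < LOCAL_FRAMES; i++)
        if (local_frame[i] == page)
            return true;
    return false;
}

/* Access a page: hit in local memory, or swap it in from the memory
 * blade, evicting the oldest resident page (FIFO). */
static void access_page(int page)
{
    if (in_local(page)) {
        local_hits++;
        return;
    }
    remote_fetches++;                    /* simulated remote-memory transfer */
    local_frame[fifo_next] = page;       /* evict victim, install new page   */
    fifo_next = (fifo_next + 1) % LOCAL_FRAMES;
}

int main(void)
{
    for (int i = 0; i < LOCAL_FRAMES; i++)
        local_frame[i] = -1;

    /* A synthetic access pattern with some locality. */
    int trace[] = { 0, 1, 2, 0, 1, 5, 0, 1, 2, 9, 0, 1 };
    for (unsigned i = 0; i < sizeof trace / sizeof trace[0]; i++)
        access_page(trace[i] % REMOTE_PAGES);

    printf("local hits: %ld, remote fetches: %ld\n",
           local_hits, remote_fetches);
    return 0;
}

In this toy, replacing FIFO with LRU bookkeeping would avoid a few fetches on the synthetic trace, but the abstract's point is the opposite trade-off: when a remote fetch costs far less than a disk access, the per-access overhead of a more sophisticated replacement policy inside the hypervisor can outweigh the misses it prevents.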