Power-optimized Deployment of Key-value Stores Using Storage Class Memory

ACM Transactions on Storage (TOS), Pub Date: 2022-03-10, DOI: 10.1145/3511905

H. Kassa, Jason B. Akers, Mrinmoy Ghosh, Zhichao Cao, V. Gogte, R. Dreslinski


Citations: 2

Abstract

High-performance flash-based key-value stores in data centers use large amounts of DRAM to cache hot data. However, motivated by the high cost and power consumption of DRAM, server designs with a lower DRAM-per-compute ratio are becoming popular. These low-cost servers enable scale-out services by reducing server workload densities. This improves overall service reliability, which in turn lowers the total cost of ownership (TCO) for scalable workloads. Nevertheless, for key-value stores with large memory footprints, these reduced-DRAM servers degrade performance because both IO utilization and data access latency increase. In this scenario, a standard practice for improving the performance of sharded databases is to reduce the number of shards per machine, which erodes the TCO benefits of reduced-DRAM low-cost servers. In this work, we explore a practical solution that improves performance and reduces the cost and power consumption of key-value stores running on DRAM-constrained servers by using Storage Class Memories (SCM). SCMs in a DIMM form factor, although slower than DRAM, are sufficiently faster than flash to serve as a large extension to DRAM. With new technologies like Compute Express Link, we can expand the memory capacity of servers with high-bandwidth, low-latency connectivity to SCM. In this article, we use Intel Optane PMem 100 Series SCMs (DCPMM) in AppDirect mode to extend the available memory of our existing single-socket platform deployment of RocksDB (one of the largest key-value stores at Meta). We first designed a hybrid cache in RocksDB to harness both DRAM and SCM hierarchically. We then characterized the performance of the hybrid cache for three of the largest RocksDB use cases at Meta (ChatApp, BLOB Metadata, and Hive Cache). Our results demonstrate that we can achieve up to 80% improvement in throughput and 20% improvement in P95 latency over the existing small-DRAM single-socket platform, while maintaining a 43–48% cost improvement over our large-DRAM dual-socket platform. To the best of our knowledge, this is the first study of the DCPMM platform in a commercial data center.
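
To make the idea of a cache that uses DRAM and SCM "hierarchically" concrete, the following is a minimal sketch of a generic two-tier LRU cache. It is not the authors' RocksDB implementation: the abstract does not describe the admission, promotion, or eviction policy, so everything here is an assumption made for illustration. The class name `TwoTierCache`, the `load_from_flash` callback, and the capacities are hypothetical, and both tiers are ordinary in-memory dictionaries that merely stand in for DRAM and SCM.

```python
from collections import OrderedDict


class TwoTierCache:
    """Toy hierarchical cache: a small fast tier in front of a larger slow tier.

    Assumed policy (not taken from the paper): LRU in both tiers, promote to
    the fast tier on every access, demote fast-tier victims to the slow tier.
    """

    def __init__(self, dram_capacity, scm_capacity, load_from_flash):
        self.dram = OrderedDict()  # stands in for the DRAM tier
        self.scm = OrderedDict()   # stands in for the SCM tier
        self.dram_capacity = dram_capacity
        self.scm_capacity = scm_capacity
        self.load_from_flash = load_from_flash  # hypothetical backing-store read
        self.stats = {"dram_hit": 0, "scm_hit": 0, "flash_read": 0}

    def get(self, key):
        # 1. Fast tier: refresh recency and return.
        if key in self.dram:
            self.dram.move_to_end(key)
            self.stats["dram_hit"] += 1
            return self.dram[key]
        # 2. Slow tier: remove the entry so it can be promoted.
        if key in self.scm:
            value = self.scm.pop(key)
            self.stats["scm_hit"] += 1
        else:
            # 3. Miss in both tiers: read from the backing store.
            value = self.load_from_flash(key)
            self.stats["flash_read"] += 1
        self._insert_dram(key, value)
        return value

    def _insert_dram(self, key, value):
        self.dram[key] = value
        self.dram.move_to_end(key)
        if len(self.dram) > self.dram_capacity:
            # Demote the least recently used DRAM entry to the SCM tier.
            old_key, old_value = self.dram.popitem(last=False)
            self._insert_scm(old_key, old_value)

    def _insert_scm(self, key, value):
        self.scm[key] = value
        self.scm.move_to_end(key)
        if len(self.scm) > self.scm_capacity:
            # Entries evicted from SCM are dropped; they remain on flash.
            self.scm.popitem(last=False)


cache = TwoTierCache(dram_capacity=2, scm_capacity=4,
                     load_from_flash=lambda k: "value-" + k)
for k in ["a", "b", "c", "a", "c"]:
    cache.get(k)
print(cache.stats)  # {'dram_hit': 1, 'scm_hit': 1, 'flash_read': 3}
```

In this toy run, the second read of `a` is served from the slow tier instead of the backing store, which is the kind of flash IO a larger SCM tier is meant to absorb. A real deployment would also have to decide which block types belong in which tier and account for SCM's higher access latency, neither of which this sketch models.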

