Workload-Aware Log-Structured Merge Key-Value Store for NVM-SSD Hybrid Storage

2023 IEEE 39th International Conference on Data Engineering (ICDE), Pub Date: 2023-04-01, DOI: 10.1109/ICDE55515.2023.00171

Lixiang Chen, Ruihao Chen, Chengcheng Yang, Yuxing Han, Rong Zhang, Xuan Zhou, Peiquan Jin, Weining Qian



Abstract


The log-structured merge tree (LSM-tree) has been widely adopted as the backbone of modern key-value stores. However, the multiple, exponentially growing levels of the LSM-tree cause high write amplification. Existing studies often improve write performance by sacrificing read performance, which is an inefficient way to trade off update efficiency against search efficiency. In this paper, we exploit non-volatile memory (NVM) to address write amplification in systems with NVM-SSD hybrid storage, and further propose a reinforcement learning method to balance update and search efficiency under varying workloads. Specifically, we first propose a lightweight hot-data identification method that efficiently captures both access recency and access frequency in NVM with relatively large capacity. On this basis, we can eliminate the different versions of frequently updated data in high-performance NVM without pushing them to SSD. To improve data access locality and facilitate fine-grained index tuning in each level, we devise a virtual-split method that partitions the key space gradually without extra write amplification. Finally, we propose a cost-based Q-learning algorithm that adaptively tunes the data organization of each partition according to changing access patterns. Experimental results show that our approach outperforms existing methods by up to 2.67×.
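The abstract does not give the details of the cost-based Q-learning step, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a hypothetical action set of two data organizations ("leveling" and "tiering"), a made-up cost model, and a state defined as a bucketed read ratio per partition. The discount factor is set to 0 because, in this toy workload, the next state does not depend on the chosen action.

```python
import random
from collections import defaultdict

# Hypothetical action set: two ways to organize one partition's data.
ACTIONS = ("leveling", "tiering")


def cost(action: str, read_ratio: float) -> float:
    """Toy cost model (an assumption, not the paper's): leveling makes
    reads cheap and writes expensive; tiering does the opposite."""
    read_cost, write_cost = (1.0, 4.0) if action == "leveling" else (3.0, 1.0)
    return read_ratio * read_cost + (1.0 - read_ratio) * write_cost


def bucket(read_ratio: float) -> int:
    """Discretize the observed read ratio into one of five states (0..4)."""
    return min(int(read_ratio * 5), 4)


class PartitionTuner:
    """Tabular Q-learning agent for a single key-range partition."""

    def __init__(self, alpha=0.5, gamma=0.0, epsilon=0.1, seed=0):
        self.q = defaultdict(float)  # (state, action) -> estimated value
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def best(self, state: int) -> str:
        """Greedy action: the organization with the highest Q-value."""
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def choose(self, state: int) -> str:
        """Epsilon-greedy choice: explore occasionally, otherwise exploit."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return self.best(state)

    def update(self, state: int, action: str, reward: float, next_state: int) -> None:
        """Standard Q-learning update toward reward + gamma * max_a Q(s', a)."""
        target = reward + self.gamma * max(self.q[(next_state, a)] for a in ACTIONS)
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


def train(tuner: PartitionTuner, read_ratios: list) -> None:
    """Replay a sequence of observed read ratios; the reward is the negative cost."""
    for i, ratio in enumerate(read_ratios):
        state = bucket(ratio)
        action = tuner.choose(state)
        next_state = bucket(read_ratios[min(i + 1, len(read_ratios) - 1)])
        tuner.update(state, action, -cost(action, ratio), next_state)
```

For example, after `train(tuner, [0.9, 0.1] * 100)` on a fresh `PartitionTuner`, `tuner.best(bucket(0.9))` selects the read-friendly organization and `tuner.best(bucket(0.1))` the write-friendly one under this toy cost model. A real system would derive the cost from measured I/O rather than from fixed constants.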
