ACM Transactions on Storage最新文献

筛选
英文 中文
Index Shipping for Efficient Replication in LSM Key-Value Stores with Hybrid KV Placement 在混合 KV 放置的 LSM 键值存储中实现高效复制的索引运输
IF 1.7 3区 计算机科学
ACM Transactions on Storage Pub Date : 2024-04-16 DOI: 10.1145/3658672
Giorgos Stilianakis, Giorgos Saloustros, Orestis Chiotakis, Giorgos Xanthakis, Angelos Bilas
{"title":"Index Shipping for Efficient Replication in LSM Key-Value Stores with Hybrid KV Placement","authors":"Giorgos Stilianakis, Giorgos Saloustros, Orestis Chiotakis, Giorgos Xanthakis, Angelos Bilas","doi":"10.1145/3658672","DOIUrl":"https://doi.org/10.1145/3658672","url":null,"abstract":"<p>Key-value (KV) stores based on LSM tree have become a foundational layer in the storage stack of datacenters and cloud services. Current approaches for achieving reliability and availability favor reducing network traffic and send to replicas only new KV pairs. As a result, they perform costly compactions to reorganize data in both the primary and backup nodes, which increases device I/O traffic and CPU overhead, and eventually hurts overall system performance. In this paper we describe <i>Tebis</i>, an efficient LSM-based KV store that reduces I/O amplification and CPU overhead for maintaining the replica index. We use a primary-backup replication scheme that performs compactions only on the primary nodes and sends pre-built indexes to backup nodes, avoiding all compactions in backup nodes. Our approach includes an efficient mechanism to deal with pointer translation across nodes in the pre-built region index. Our results show that <i>Tebis</i>\u0000reduces resource utilization on backup nodes compared to performing full compactions: Throughput is increased by 1.06 − 2.90 ×, CPU efficiency is increased by 1.21 − 2.78 ×, and I/O amplification is reduced by 1.7 − 3.27 ×, while network traffic increases by up to 1.32 − 3.76 ×.</p>","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140588933","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
eZNS: Elastic Zoned Namespace for Enhanced Performance Isolation and Device Utilization eZNS:增强性能隔离和设备利用率的弹性分区命名空间
IF 1.7 3区 计算机科学
ACM Transactions on Storage Pub Date : 2024-04-12 DOI: 10.1145/3653716
Jaehong Min, Chenxingyu Zhao, Ming Liu, Arvind Krishnamurthy
{"title":"eZNS: Elastic Zoned Namespace for Enhanced Performance Isolation and Device Utilization","authors":"Jaehong Min, Chenxingyu Zhao, Ming Liu, Arvind Krishnamurthy","doi":"10.1145/3653716","DOIUrl":"https://doi.org/10.1145/3653716","url":null,"abstract":"<p>Emerging Zoned Namespace (ZNS) SSDs, providing the coarse-grained zone abstraction, hold the potential to significantly enhance the cost-efficiency of future storage infrastructure and mitigate performance unpredictability. However, existing ZNS SSDs have a static zoned interface, making them in-adaptable to workload runtime behavior, unscalable to underlying hardware capabilities, and interfering with co-located zones. Applications either under-provision the zone resources yielding unsatisfied throughput, create over-provisioned zones and incur costs, or experience unexpected I/O latencies. </p><p>We propose eZNS, an elastic-zoned namespace interface that exposes an adaptive zone with predictable characteristics. eZNS comprises two major components: a zone arbiter that manages zone allocation and active resources on the control plane, a hierarchical I/O scheduler with read congestion control, and write admission control on the data plane. Together, eZNS enables the transparent use of a ZNS SSD and closes the gap between application requirements and zone interface properties. Our evaluations over RocksDB demonstrate that eZNS outperforms a static zoned interface by 17.7% and 80.3% in throughput and tail latency, respectively, at most.</p>","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2024-04-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140576562","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
A Contract-Aware and Cost-effective LSM Store for Cloud Storage with Low Latency Spikes 面向低延迟峰值云存储的合约感知型低成本高效率 LSM 存储器
IF 1.7 3区 计算机科学
ACM Transactions on Storage Pub Date : 2024-02-20 DOI: 10.1145/3643851
Yuanhui Zhou, Jian Zhou, Kai Lu, Ling Zhan, Peng Xu, Peng Wu, Shuning Chen, Xian Liu, Jiguang Wan
{"title":"A Contract-Aware and Cost-effective LSM Store for Cloud Storage with Low Latency Spikes","authors":"Yuanhui Zhou, Jian Zhou, Kai Lu, Ling Zhan, Peng Xu, Peng Wu, Shuning Chen, Xian Liu, Jiguang Wan","doi":"10.1145/3643851","DOIUrl":"https://doi.org/10.1145/3643851","url":null,"abstract":"<p>Cloud storage is gaining popularity because features such as pay-as-you-go significantly reduce storage costs. However, the community has not sufficiently explored its contract model and latency characteristics. As LSM-Tree-based key-value stores (LSM stores) become the building block for numerous cloud applications, how cloud storage would impact the performance of key-value accesses is vital. This study reveals the significant latency variances of Amazon Elastic Block Store (EBS) under various I/O pressures, which challenges LSM store read performance on cloud storage. To reduce the corresponding tail latency, we propose Calcspar, a contract-aware LSM store for cloud storage, which efficiently addresses the challenges by regulating the rate of I/O requests to cloud storage and absorbing surplus I/O requests with the data cache. We specifically developed a fluctuation-aware cache to lower the high latency brought on by workload fluctuations. Additionally, we build a congestion-aware IOPS allocator to reduce the impact of LSM store internal operations on read latency. We evaluated Calcspar on EBS with different real-world workloads and compared it to the cutting-edge LSM stores. The results show that Calcspar can significantly reduce tail latency while maintaining regular read and write performance, keeping the 99<sup>th</sup> percentile latency under 550<i>μ</i>s and reducing average latency by 66%. In addition, Calcspar has lower write prices and average latency compared to Cloud NoSQL services offered by cloud vendors.</p>","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2024-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139956219","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Introduction to the Special Section on USENIX ATC 2023 USENIX ATC 2023 特别分会简介
IF 1.7 3区 计算机科学
ACM Transactions on Storage Pub Date : 2024-02-19 DOI: 10.1145/3635156
Dan Williams, Julia Lawall
{"title":"Introduction to the Special Section on USENIX ATC 2023","authors":"Dan Williams, Julia Lawall","doi":"10.1145/3635156","DOIUrl":"https://doi.org/10.1145/3635156","url":null,"abstract":"<p>No abstract available.</p>","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2024-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139918299","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Tarazu: An Adaptive End-to-End I/O Load Balancing Framework for Large-Scale Parallel File Systems Tarazu:面向大规模并行文件系统的自适应端到端 I/O 负载平衡框架
IF 1.7 3区 计算机科学
ACM Transactions on Storage Pub Date : 2024-02-01 DOI: 10.1145/3641885
Arnab K. Paul, Sarah Neuwirth, Bharti Wadhwa, Feiyi Wang, Sarp Oral, Ali R. Butt
{"title":"Tarazu: An Adaptive End-to-End I/O Load Balancing Framework for Large-Scale Parallel File Systems","authors":"Arnab K. Paul, Sarah Neuwirth, Bharti Wadhwa, Feiyi Wang, Sarp Oral, Ali R. Butt","doi":"10.1145/3641885","DOIUrl":"https://doi.org/10.1145/3641885","url":null,"abstract":"<p>The imbalanced I/O load on large parallel file systems affects the parallel I/O performance of high-performance computing (HPC) applications. One of the main reasons for I/O imbalances is the lack of a global view of system-wide resource consumption. While approaches to address the problem already exist, the diversity of HPC workloads combined with different file striping patterns prevents widespread adoption of these approaches. In addition, load balancing techniques should be transparent to client applications. To address these issues, we propose Tarazu, an end-to-end control plane where clients transparently and adaptively write to a set of selected I/O servers to achieve balanced data placement. Our control plane leverages real-time load statistics for global data placement on distributed storage servers, while our design model employs trace-based optimization techniques to minimize latency for I/O load requests between clients and servers and to handle multiple striping patterns in files. We evaluate our proposed system on an experimental cluster for two common use cases: the synthetic I/O benchmark IOR and the scientific application I/O kernel HACC-I/O. We also use a discrete-time simulator with real HPC application traces from emerging workloads running on the Summit supercomputer to validate the effectiveness and scalability of Tarazu in large-scale storage environments. The results show improvements in load balancing and read performance of up to (33% ) and (43% ) percent, respectively, compared to the state of the art.</p>","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2024-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139657276","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
An End-to-End High-Performance Deduplication Scheme for Docker Registries and Docker Container Storage Systems 针对 Docker 注册表和 Docker 容器存储系统的端到端高性能重复数据删除方案
IF 1.7 3区 计算机科学
ACM Transactions on Storage Pub Date : 2024-01-30 DOI: 10.1145/3643819
Nannan Zhao, Muhui Lin, Hadeel Albahar, Arnab K. Paul, Zhijie Huang, Subil Abraham, Keren Chen, Vasily Tarasov, Dimitrios Skourtis, Ali Anwar, Ali R. Butt
{"title":"An End-to-End High-Performance Deduplication Scheme for Docker Registries and Docker Container Storage Systems","authors":"Nannan Zhao, Muhui Lin, Hadeel Albahar, Arnab K. Paul, Zhijie Huang, Subil Abraham, Keren Chen, Vasily Tarasov, Dimitrios Skourtis, Ali Anwar, Ali R. Butt","doi":"10.1145/3643819","DOIUrl":"https://doi.org/10.1145/3643819","url":null,"abstract":"<p>The wide adoption of Docker containers for supporting agile and elastic enterprise applications has led to a broad proliferation of container images. The associated storage performance and capacity requirements place a high pressure on the infrastructure of <b>container registries</b> that store and distribute images and <b>container storage systems</b> on the Docker client side that manage image layers and store ephemeral data generated at container runtime. The storage demand is worsened by the large amount of duplicate data in images. Moreover, container storage systems that use Copy-on-Write (CoW) file systems as storage drivers exacerbate the redundancy. Exploiting the high file redundancy in real-world images is a promising approach to drastically reduce the growing storage requirements of container registries and improve the space efficiency of container storage systems. However, existing deduplication techniques significantly degrade the performance of both registries and container storage systems because of data reconstruction overhead as well as the deduplication cost. </p><p>We propose DupHunter, an end-to-end deduplication scheme that deduplicates layers for both Docker registries and container storage systems while maintaining a high image distribution speed and container I/O performance. DupHunter is divided into 3 tiers: registry tier, middle tier, and client tier. Specifically, we first build a high-performance deduplication engine at the registry tier that not only natively deduplicates layers for space savings but also reduces layer restore overhead. Then, we use deduplication offloading at the middle tier to eliminate the redundant files from the client tier and avoid bringing deduplication overhead to the clients. To further reduce the data duplicates caused by CoWs and improve the container I/O performance, we utilize a container-aware storage system at the client tier that reserves space for each container and arranges the placement of files and their modifications on the disk to preserve locality. Under real workloads, DupHunter reduces storage space by up to 6.9 × and reduces the <monospace>GET</monospace> layer latency up to 2.8 × compared to the state-of-the-art. Moreover, DupHunter can improve the container I/O performance by up to 93% for reads and 64% for writes.</p>","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2024-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139646165","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Exploiting Data-pattern-aware Vertical Partitioning to Achieve Fast and Low-cost Cloud Log Storage 利用数据模式感知垂直分区实现快速、低成本的云日志存储
IF 1.7 3区 计算机科学
ACM Transactions on Storage Pub Date : 2024-01-29 DOI: 10.1145/3643641
Junyu Wei, Guangyan Zhang, Junchao Chen, Yang Wang, Weimin Zheng, Tingtao Sun, Jiesheng Wu, Jiangwei Jiang
{"title":"Exploiting Data-pattern-aware Vertical Partitioning to Achieve Fast and Low-cost Cloud Log Storage","authors":"Junyu Wei, Guangyan Zhang, Junchao Chen, Yang Wang, Weimin Zheng, Tingtao Sun, Jiesheng Wu, Jiangwei Jiang","doi":"10.1145/3643641","DOIUrl":"https://doi.org/10.1145/3643641","url":null,"abstract":"<p>Cloud logs can be categorized into on-line, off-line, and near-line logs based on the access frequency. Among them, near-line logs are mainly used for debugging, which means they prefer a low query latency for better user experience. Besides, the storage system for near-line logs prefers a low overall cost including the storage cost to store compressed logs, and the computation cost to compress logs and execute queries. These requirements pose challenges to achieving fast and cheap cloud log storage. </p><p>This paper proposes LogGrep, the first log compression and query tool that exploits both static and runtime patterns to properly structurize and organize log data in fine-grained units. The key idea of LogGrep is “vertical partitioning”: it stores each log entry into multiple partitions by first parsing logs into variable vectors according to static patterns and then extracting runtime pattern(s) automatically within each variable vector. Based on such runtime patterns, LogGrep further decomposes the variable vectors into fine-grained units called “Capsules” and stamps each Capsule with a summary of its values. During the query process, LogGrep can avoid decompressing and scanning Capsules that cannot match the keywords, with the help of the extracted runtime patterns and the Capsule stamps. We further show that the interactive debugging can well utilize the advantages of the vertical-partitioning-based method and mitigate its weaknesses as well. To this end, LogGrep integrates incremental locating and partial reconstruction to mitigate the read amplification incurred by vertical-partitioning-based method. </p><p>We evaluate LogGrep on 37 cloud logs from the production environment of Alibaba Cloud and the public datasets. The results show that LogGrep can reduce the query latency and the overall cost by an order of magnitude compared with state-of-the-art works. Such results have confirmed that it is worthwhile applying a more sophisticated vertical-partitioning-based method to accelerate queries on compressed cloud logs.</p>","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2024-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139585234","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Bridging Software-Hardware for CXL Memory Disaggregation in Billion-Scale Nearest Neighbor Search 为十亿级近邻搜索中的 CXL 内存分解架设软硬件桥梁
IF 1.7 3区 计算机科学
ACM Transactions on Storage Pub Date : 2024-01-06 DOI: 10.1145/3639471
Junhyeok Jang, Hanjin Choi, Hanyeoreum Bae, Seungjun Lee, Miryeong Kwon, Myoungsoo Jung
{"title":"Bridging Software-Hardware for CXL Memory Disaggregation in Billion-Scale Nearest Neighbor Search","authors":"Junhyeok Jang, Hanjin Choi, Hanyeoreum Bae, Seungjun Lee, Miryeong Kwon, Myoungsoo Jung","doi":"10.1145/3639471","DOIUrl":"https://doi.org/10.1145/3639471","url":null,"abstract":"<p>We propose <i>CXL-ANNS</i>, a software-hardware collaborative approach to enable scalable approximate nearest neighbor search (ANNS) services. To this end, we first disaggregate DRAM from the host via compute express link (CXL) and place all essential datasets into its memory pool. While this CXL memory pool allows ANNS to handle billion-point graphs without an accuracy loss, we observe that the search performance significantly degrades because of CXL’s far-memory-like characteristics. To address this, CXL-ANNS considers the node-level relationship and caches the neighbors in local memory, which are expected to visit most frequently. For the uncached nodes, CXL-ANNS prefetches a set of nodes most likely to visit soon by understanding the graph traversing behaviors of ANNS. CXL-ANNS is also aware of the architectural structures of the CXL interconnect network and lets different hardware components collaborate with each other for the search. Further, it relaxes the execution dependency of neighbor search tasks and allows ANNS to utilize all hardware in the CXL network in parallel. </p><p>Our evaluation shows that CXL-ANNS exhibits 93.3% lower query latency than state-of-the-art ANNS platforms that we tested. CXL-ANNS also outperforms an oracle ANNS system that has unlimited local DRAM capacity by 68.0%, in terms of latency.</p>","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2024-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139372911","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Polling Sanitization to Balance I/O Latency and Data Security of High-density SSDs 轮询净化,平衡高密度固态硬盘的 I/O 延迟和数据安全
IF 1.7 3区 计算机科学
ACM Transactions on Storage Pub Date : 2024-01-06 DOI: 10.1145/3639826
Jiaojiao Wu, Zhigang Cai, Fan Yang, Jun Li, Francois Trahay, Zheng Yang, Chao Wang, Jianwei Liao
{"title":"Polling Sanitization to Balance I/O Latency and Data Security of High-density SSDs","authors":"Jiaojiao Wu, Zhigang Cai, Fan Yang, Jun Li, Francois Trahay, Zheng Yang, Chao Wang, Jianwei Liao","doi":"10.1145/3639826","DOIUrl":"https://doi.org/10.1145/3639826","url":null,"abstract":"<p>Sanitization is an effective approach for ensuring data security through scrubbing invalid but sensitive data pages, with the cost of impacts on storage performance due to moving out valid pages from the sanitization-required wordline, which is a logical read/write unit and consists of multiple pages in high-density SSDs. To minimize the impacts on I/O latency and data security, this paper proposes a polling-based scheduling approach for data sanitization in high-density SSDs. Our method polls a specific SSD channel for completing data sanitization at the block granularity, meanwhile other channels can still service I/O requests. Furthermore, our method assigns a low priority to the blocks that are more likely to have future <i>adjacent page</i> invalidations inside sanitization-required wordlines, while selecting the sanitization block, to minimize the negative impacts of moving valid pages. Through a series of emulation experiments on several disk traces of real-world applications, we show that our proposal can decrease the negative effects of data sanitization in terms of the risk-performance index, which is a united time metric of I/O responsiveness and the unsafe time interval, by <monospace>16.34%</monospace> on average, compared to related sanitization methods.</p>","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2024-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139376310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
An LSM Tree Augmented with B+ Tree on Nonvolatile Memory 非易失性存储器上扩充B+树的LSM树
IF 1.7 3区 计算机科学
ACM Transactions on Storage Pub Date : 2023-12-02 DOI: 10.1145/3633475
Donguk Kim, Jongsung Lee, Keun Soo Lim, Jun Heo, Tae Jun Ham, Jae W. Lee
{"title":"An LSM Tree Augmented with B+ Tree on Nonvolatile Memory","authors":"Donguk Kim, Jongsung Lee, Keun Soo Lim, Jun Heo, Tae Jun Ham, Jae W. Lee","doi":"10.1145/3633475","DOIUrl":"https://doi.org/10.1145/3633475","url":null,"abstract":"<p>Modern log-structured merge (LSM) tree-based key-value stores are widely used to process update-heavy workloads effectively as the LSM tree sequentializes write requests to a storage device to maximize storage performance. However, this append-only approach leaves many outdated copies of frequently updated key-value pairs, which need to be routinely cleaned up through the operation called <i>compaction</i>. When the system load is modest, compaction happens in background. However, at a high system load it can quickly become the major performance bottleneck. To address this compaction bottleneck and further improve the write throughput of LSM tree-based key-value stores, we propose LAB-DB, which augments the existing LSM tree with a pair of B<sup>+</sup> trees on byte-addressable nonvolatile memory (NVM). The auxiliary B<sup>+</sup> trees on NVM reduce both compaction frequency and compaction time, hence leading to lower compaction overhead for writes and fewer storage accesses for reads. According to our evaluation of LAB-DB on RocksDB with YCSB benchmarks, LAB-DB achieves 94% and 67% speedups on two write-intensive workloads (Workload A and F), and also a 43% geomean speedup on read-intensive YCSB Workload B, C, D, and E. This performance gain comes with a low cost of NVM whose size is just 0.6% of the entire dataset to demonstrate the scalability of LAB-DB with an ever increasing volume of future datasets.</p>","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2023-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138512821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信