RGKV: A GPGPU-Empowered Compaction Framework for LSM-Tree-Based KV Stores With Optimized Data Transfer and Parallel Processing

IF 3.6 · Zone 2 (Computer Science) · JCR Q2: COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IEEE Transactions on Computers Pub Date : 2025-01-29 DOI:10.1109/TC.2025.3535832

Hui Sun;Xiangxiang Jiang;Yinliang Yue;Xiao Qin

IEEE Transactions on Computers, vol. 74, no. 5, pp. 1605-1619. Journal article; not open access. https://ieeexplore.ieee.org/document/10857621/

Citations: 0

Abstract

The log-structured merge-tree (LSM-tree), widely adopted in key-value (KV) stores, is valued for its efficient write performance and scalability in large-scale data processing. However, LSM-tree compaction consumes significant computational resources and becomes a bottleneck for system performance. Traditionally, compaction is handled by CPUs, but CPU processing capacity often falls short as data volumes surge. To address this challenge, existing solutions attempt to accelerate compaction using GPGPUs. Because of low GPGPU parallelism and data transfer delays in prior studies, the anticipated performance improvements have not yet been fully realized. In this paper, we present RGKV, a comprehensive optimization approach that overcomes the limitations of current GPGPU-empowered KV stores. RGKV features GPGPU-adapted contiguous memory allocation and a GPGPU-optimized key-value block architecture to provide highly efficient GPGPU parallel encoding and decoding tailored to KV stores. To enhance the computational efficiency and overall performance of KV stores, RGKV employs a parallel merge-sorting algorithm to maximize the parallel processing capabilities of the GPGPU. Moreover, RGKV incorporates a data transfer module built on GPUDirect Storage technology and designed for KV stores, together with an efficient data structure that substantially reduces data transfer latency between an SSD and a GPGPU, boosting data transfer speed and alleviating CPU load. Experimental results demonstrate that RGKV achieves a $4\times$ improvement in overall throughput and a $7\times$ improvement in compaction throughput compared to state-of-the-art KV stores, while also reducing average write latency by 70.6%.
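To make the idea of parallel merging during compaction concrete, here is a minimal Python sketch. It is not RGKV's algorithm, which the abstract does not detail; it only illustrates one common way to parallelize a merge, assuming two key-sorted runs with unique keys per run. The function names (`merge_chunk`, `partitioned_compact`) and the splitter heuristic are hypothetical, and CPU threads stand in for GPGPU thread blocks.

```python
from bisect import bisect_left
from concurrent.futures import ThreadPoolExecutor


def merge_chunk(newer, older):
    """Merge two key-sorted lists of (key, value) pairs.

    When the same key appears in both runs, the entry from the
    newer run wins and the older one is dropped.
    """
    out, i, j = [], 0, 0
    while i < len(newer) and j < len(older):
        if newer[i][0] < older[j][0]:
            out.append(newer[i])
            i += 1
        elif newer[i][0] > older[j][0]:
            out.append(older[j])
            j += 1
        else:  # same key: keep the newer version only
            out.append(newer[i])
            i += 1
            j += 1
    out.extend(newer[i:])
    out.extend(older[j:])
    return out


def partitioned_compact(newer, older, parts=4):
    """Split both runs at the same splitter keys, merge each range independently."""
    newer_keys = [k for k, _ in newer]
    older_keys = [k for k, _ in older]
    # Assumption: evenly spaced keys of the longer run are good enough splitters.
    base = newer_keys if len(newer_keys) >= len(older_keys) else older_keys
    if not base:
        return []
    splitters = [base[p * len(base) // parts] for p in range(1, parts)]

    def cuts(keys):
        # Index boundaries so that chunk p holds keys in [splitter p-1, splitter p).
        return [0] + [bisect_left(keys, s) for s in splitters] + [len(keys)]

    nc, oc = cuts(newer_keys), cuts(older_keys)
    newer_chunks = [newer[nc[p]:nc[p + 1]] for p in range(parts)]
    older_chunks = [older[oc[p]:oc[p + 1]] for p in range(parts)]

    # Each key range is disjoint from the others, so chunks can merge concurrently.
    with ThreadPoolExecutor(max_workers=parts) as pool:
        merged = pool.map(merge_chunk, newer_chunks, older_chunks)
    return [pair for chunk in merged for pair in chunk]
```

Because both runs are cut at the same keys, every version of a given key lands in the same chunk, so the concatenated per-chunk results equal a sequential merge of the two runs.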




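Source journal

IEEE Transactions on Computers (Engineering & Technology - Engineering: Electrical & Electronic)

CiteScore: 6.60
Self-citation rate: 5.40%
Articles published: 199
Review time: 6.0 months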

Journal description: The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field. It publishes papers on research in areas of current interest to the readers. These areas include, but are not limited to, the following: a) computer organizations and architectures; b) operating systems, software systems, and communication protocols; c) real-time systems and embedded systems; d) digital devices, computer components, and interconnection networks; e) specification, design, prototyping, and testing methods and tools; f) performance, fault tolerance, reliability, security, and testability; g) case studies and experimental and theoretical evaluations; and h) new and important applications and trends.