A+Store: An Asynchronous Parallel Compaction for Multi-NDP-Enabled Key–Value Store

IF 4.1 · Zone 2, Computer Science · Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Journal of Systems Architecture Pub Date : 2025-08-21 DOI:10.1016/j.sysarc.2025.103549

Hui Sun, Bo Chen, Jiaming Huang, Qiang Wang, Xiaole Liu, Yi Zhou, Yinliang Yue, Xiao Qin

Journal of Systems Architecture, volume 168, Article 103549. https://www.sciencedirect.com/science/article/pii/S1383762125002218

Citations: 0

Abstract

LSM-tree-based key–value stores face significant I/O bandwidth consumption and performance bottlenecks due to frequent data rewrites and migrations during compaction. To address this issue, near-data processing (NDP) technology has emerged as a promising solution and is gaining increasing attention. NDP reduces the data transfer distance between storage and processing resources by placing computational resources closer to storage devices or integrating them into memory, thereby alleviating performance bottlenecks. However, existing multi-NDP key–value stores still face synchronization problems, leading to long wait times and underutilization of resources. To address these issues, we propose A⁺Store, an asynchronous parallel compaction scheme for multi-NDP-enabled key–value stores. To optimize data layout, A⁺Store implements an MLSM-tree on each NDP device, an asynchronous execution queue for dynamic task management, and an independent metadata management method. This asynchronous mechanism allows each NDP device to update its metadata immediately after completing a compaction task rather than waiting for other devices, thereby eliminating synchronization waiting time among NDP devices. Additionally, because each NDP device stores SSTables within specific key ranges, the device can perform sub-compaction tasks in parallel according to its key range, significantly speeding up task execution within each NDP device. This approach improves the system's parallel processing capability and resource utilization, addressing the bottlenecks of existing multi-NDP KV stores in applications that require large-scale data processing and low latency. To evaluate A⁺Store, we compare it against state-of-the-art KV stores, including PStore, MStore, and RocksDB (configured with a RAID architecture). We develop a test toolkit using the real-world dataset OpenAlex and study the performance of A⁺Store under realistic workloads. Experimental results show that A⁺Store demonstrates superior performance across all tests. For example, when loading 100 GB of writes, A⁺Store achieves 2.87× the throughput of PStore and 2× that of MStore, while reducing write amplification by 65.3% and 24.8% compared to PStore and MStore (both NDP-empowered KV stores), respectively.
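The two ideas in the abstract, removing the cross-device barrier and splitting a compaction by key range, can be illustrated with a toy model. This is only a sketch under stated assumptions and not the authors' implementation: the function names (`sync_makespan`, `async_makespan`, `sub_compact`), the task durations, and the data layout (sorted runs of `(key, value)` pairs, newest run first) are all hypothetical.

```python
import bisect
import heapq
from concurrent.futures import ThreadPoolExecutor


def sync_makespan(durations):
    """Barrier model: round i ends only when every device finishes its i-th task.

    Assumes every device has the same number of tasks.
    """
    return sum(max(round_) for round_ in zip(*durations.values()))


def async_makespan(durations):
    """Asynchronous model: each device drains its own queue back to back."""
    return max(sum(tasks) for tasks in durations.values())


def merge_range(slices):
    """Merge the slices of one key sub-range; slices[0] comes from the newest run."""
    tagged = ([(key, age, value) for key, value in s] for age, s in enumerate(slices))
    merged = []
    for key, _age, value in heapq.merge(*tagged):
        if not merged or merged[-1][0] != key:  # first hit per key is the newest
            merged.append((key, value))
    return merged


def sub_compact(runs, boundaries, workers=4):
    """Split sorted runs at the given key boundaries and merge each sub-range in parallel."""
    keys = [[key for key, _ in run] for run in runs]
    edges = [None] + list(boundaries) + [None]  # None marks an open-ended range
    jobs = []
    for lo, hi in zip(edges, edges[1:]):
        slices = []
        for run, run_keys in zip(runs, keys):
            start = 0 if lo is None else bisect.bisect_left(run_keys, lo)
            stop = len(run_keys) if hi is None else bisect.bisect_left(run_keys, hi)
            slices.append(run[start:stop])
        jobs.append(slices)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(merge_range, jobs))  # map() preserves sub-range order
    return [pair for part in parts for pair in part]


# Hypothetical per-device compaction times (arbitrary units).
durations = {"ndp0": [4, 1, 1], "ndp1": [1, 4, 1], "ndp2": [1, 1, 4]}
print(sync_makespan(durations), async_makespan(durations))

# Two hypothetical sorted runs; the first run is the newer one.
runs = [[("a", "new"), ("m", "x")], [("a", "old"), ("h", "y"), ("z", "w")]]
print(sub_compact(runs, ["g", "n"]))
```

In the barrier model every round costs as much as its slowest device, so devices with short tasks sit idle; in the asynchronous model each device finishes as soon as its own queue is empty. The sub-range merges are independent because the key ranges do not overlap, which is what makes running them in parallel safe in this sketch.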


Source journal

Journal of Systems Architecture (Engineering & Technology, Computer Science: Hardware)

CiteScore: 8.70
Self-citation rate: 15.60%
Articles published: 226
Review time: 46 days

About the journal: The Journal of Systems Architecture: Embedded Software Design (JSA) is a journal covering all design and architectural aspects related to embedded systems and software. It ranges from the microarchitecture level via the system software level up to the application-specific architecture level. Aspects such as real-time systems, operating systems, FPGA programming, programming languages, communications (limited to analysis and the software stack), mobile systems, parallel and distributed architectures, as well as additional subjects in the computer and system architecture area, fall within the scope of this journal. Technology will not be a main focus, but its use and relevance to particular designs will be. Case studies are welcome but must contribute more than just a design for a particular piece of software. Design automation of such systems, including methodologies, techniques, and tools for their design, as well as novel designs of software components, falls within the scope of this journal. Novel applications that use embedded systems are also central to this journal. While hardware is not a part of this journal, hardware/software co-design methods that consider the interplay between software and hardware components with an emphasis on software are also relevant here.