I/O Transit Caching for PMem-based Block Device

arXiv - CS - Operating Systems. Pub Date: 2024-03-10. DOI: arxiv-2403.06120

Qing Xu, Qisheng Jiang, Chundong Wang



Abstract

Byte-addressable non-volatile memory (NVM) attached to the memory bus serves as persistent memory (PMem) for data storage in general-purpose and embedded computing systems. Researchers have developed software drivers such as the block translation table (BTT) to build block devices on PMem, so that programmers can keep using the mature and reliable conventional storage stack while expecting high performance from fast PMem. However, our quantitative study shows that BTT underutilizes PMem and yields inferior performance because it lacks an essential in-device cache. We add a conventional I/O staging cache made of DRAM space to BTT. Because DRAM and PMem have comparable access latency, the I/O staging cache is likely to fill up over time. Continual cache evictions and fsyncs then trigger on-demand flushes with severe stalls, which makes the I/O staging cache unappealing for PMem-based block devices. We accordingly propose an algorithm named Caiti that uses novel I/O transit caching. Caiti eagerly evicts buffered data to PMem using multiple CPU cores. It also conditionally bypasses a full cache and writes data directly into PMem to further alleviate I/O stalls. Experiments confirm that Caiti boosts the performance of BTT by up to 3.6x without losing block-level write atomicity.
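
The two mechanisms named in the abstract, eager eviction by multiple workers and conditional bypass of a full cache, can be illustrated with a toy model. The sketch below is not the authors' implementation: the class `TransitCache`, its `slots` and `workers` parameters, and the use of a Python dict as a stand-in for PMem are all assumptions made for illustration, and a single lock replaces whatever concurrency control and atomicity machinery the real design uses.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, wait


class TransitCache:
    """Toy DRAM transit cache in front of a simulated PMem block store."""

    def __init__(self, slots, workers=4):
        self.slots = slots        # hypothetical cache capacity, in blocks
        self.pmem = {}            # dict standing in for the PMem block device
        self.buffered = {}        # lba -> data currently held in DRAM
        self.lock = threading.Lock()
        self.pool = ThreadPoolExecutor(max_workers=workers)
        self.pending = []         # futures of evictions not yet waited on
        self.bypassed = 0         # number of writes that skipped the cache

    def _evict(self, lba):
        # Background worker: move one buffered block to PMem, freeing its slot.
        # The copy happens under the lock only to keep this toy model simple.
        with self.lock:
            data = self.buffered.pop(lba, None)
            if data is not None:
                self.pmem[lba] = data

    def write(self, lba, data):
        with self.lock:
            if lba in self.buffered:
                # Block is still in transit: overwrite it in place, and the
                # eviction that is already scheduled persists the new data.
                self.buffered[lba] = data
                return
            if len(self.buffered) >= self.slots:
                # Cache full: bypass it and write to PMem directly
                # instead of stalling until a slot is freed.
                self.pmem[lba] = data
                self.bypassed += 1
                return
            self.buffered[lba] = data
            # Eager eviction: schedule the flush right away rather than
            # waiting for cache pressure or an fsync.
            self.pending.append(self.pool.submit(self._evict, lba))

    def fsync(self):
        # Wait for every eviction scheduled so far to finish.
        with self.lock:
            futures, self.pending = self.pending, []
        wait(futures)
```

In this model a caller issues `write(lba, data)` repeatedly and then `fsync()`; because evictions start as soon as data is buffered, `fsync()` usually has little left to wait for, which is the intuition behind treating the cache as a transit point rather than a staging area. How many writes take the bypass path depends on timing and on the assumed `slots` value.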
