I/O Transit Caching for PMem-based Block Device

Qing Xu, Qisheng Jiang, Chundong Wang
arXiv - CS - Operating Systems, published 2024-03-10. DOI: arxiv-2403.06120 (https://doi.org/arxiv-2403.06120)

Abstract

Byte-addressable non-volatile memory (NVM) attached to the memory bus serves as persistent memory (PMem) for data storage in both general-purpose and embedded computing systems. Researchers have developed software drivers such as the block translation table (BTT) to build block devices on PMem, so that programmers can keep using the mature and reliable conventional storage stack while expecting high performance from fast PMem. However, our quantitative study shows that BTT underutilizes PMem and yields inferior performance because it lacks an in-device cache, which is essential. We add a conventional I/O staging cache made of DRAM to BTT. Because DRAM and PMem have comparable access latency, the I/O staging cache is likely to fill up over time. Continual cache evictions and fsyncs then trigger on-demand flushes with severe stalls, which makes the I/O staging cache unappealing in practice for PMem-based block devices. We accordingly propose Caiti, an algorithm built on a novel I/O transit caching scheme. Caiti eagerly evicts buffered data to PMem using multiple CPU cores. It also conditionally bypasses a full cache and writes data directly into PMem to further alleviate I/O stalls. Experiments confirm that Caiti boosts the performance of BTT by up to 3.6x without losing block-level write atomicity.
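
The two ideas named in the abstract, eager eviction by background workers and conditional bypass of a full cache, can be illustrated with a small user-space toy. This is only a sketch under stated assumptions and not the authors' implementation: the class name `TransitCache`, its parameters `slots` and `workers`, and the use of a Python dict as a stand-in for the PMem block device are all hypothetical, and the toy does not model BTT or its block-level write atomicity.

```python
import queue
import threading


class TransitCache:
    """Toy model of a transit cache in front of a block device.

    Assumptions (not from the paper): `pmem` is a plain dict mapping block
    numbers to bytes, `slots` is the DRAM cache capacity in blocks, and
    `workers` is the number of background eviction threads.
    """

    def __init__(self, pmem, slots=4, workers=2):
        self.pmem = pmem
        self.slots = slots
        self.buffered = {}              # block number -> newest data not yet in pmem
        self.lock = threading.Lock()    # one coarse lock keeps the toy simple
        self.pending = queue.Queue()    # block numbers waiting for eviction
        for _ in range(workers):
            threading.Thread(target=self._evict_loop, daemon=True).start()

    def write(self, lba, data):
        with self.lock:
            # Buffer the write if the block is already cached or a slot is free.
            if lba in self.buffered or len(self.buffered) < self.slots:
                self.buffered[lba] = data
                self.pending.put(lba)   # hand it to the workers right away
                return "cached"
            # Cache is full: bypass it and write straight to the device.
            self.pmem[lba] = data
            return "bypassed"

    def read(self, lba):
        with self.lock:
            if lba in self.buffered:
                return self.buffered[lba]
            return self.pmem.get(lba)

    def flush(self):
        # fsync-like call: the caller drains whatever is still buffered.
        with self.lock:
            for lba in list(self.buffered):
                self._evict_locked(lba)

    def _evict_locked(self, lba):
        # Must be called with self.lock held.
        data = self.buffered.pop(lba, None)
        if data is not None:
            self.pmem[lba] = data

    def _evict_loop(self):
        # Eager eviction: workers move data out as soon as it is queued,
        # instead of waiting for the cache to fill or for an fsync.
        while True:
            lba = self.pending.get()
            with self.lock:
                self._evict_locked(lba)
```

With `workers=0` and `slots=2`, two writes to distinct blocks are buffered and a third write to a new block takes the bypass path; calling `flush()` then moves the buffered blocks to `pmem`. The single coarse lock is a simplification chosen so that a bypassing write can never be overwritten by a stale eviction; a real driver would need finer-grained synchronization to benefit from multiple cores.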