I/O Transit Caching for PMem-based Block Device
Qing Xu, Qisheng Jiang, Chundong Wang
arXiv - CS - Operating Systems, 2024-03-10
https://doi.org/arxiv-2403.06120
Abstract
Byte-addressable non-volatile memory (NVM) sitting on the memory bus is
used as persistent memory (PMem) for data storage in general-purpose computing
systems and embedded systems. Researchers have developed software drivers
such as the block translation table (BTT) to build block devices on PMem, so
programmers can keep using the mature and reliable conventional storage stack while
expecting high performance by exploiting fast PMem. However, our quantitative
study shows that BTT underutilizes PMem and yields inferior performance because
it lacks an in-device cache, which is imperative. We add a conventional I/O
staging cache made of DRAM space to BTT. As DRAM and PMem have comparable
access latency, the I/O staging cache is likely to fill up over time.
Continual cache evictions and fsyncs thus cause on-demand flushes with severe
stalls, which makes the I/O staging cache unappealing in practice for
PMem-based block devices. We accordingly propose an algorithm named Caiti with
novel I/O transit caching. Caiti eagerly evicts buffered data to PMem using
multiple CPU cores. It also conditionally bypasses a full cache and directly
writes data into PMem to further alleviate I/O stalls. Experiments confirm that
Caiti significantly boosts the performance with BTT by up to 3.6x, without loss
of block-level write atomicity.
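
The two ideas named above, eager eviction by multiple workers and bypassing a
full cache, can be illustrated with a toy model. This is only a sketch built
from the abstract's description, not Caiti's actual algorithm: the class
ToyTransitCache, its methods, the dict standing in for PMem, and the
per-write sequence number used to keep stale buffered data from overwriting
newer bypassed data are all hypothetical. It assumes a single writer thread,
omits the read path, and does not model BTT's block-level atomicity.

```python
import itertools
import queue
import threading


class ToyTransitCache:
    """Toy model: a small DRAM buffer that workers drain as soon as possible."""

    def __init__(self, capacity=4, workers=2):
        self.pmem = {}                       # lba -> (seq, data); stand-in for PMem
        self.pmem_lock = threading.Lock()    # guards self.pmem
        self.slots = queue.Queue(maxsize=capacity)  # bounded DRAM buffer
        self.seq = itertools.count(1)        # orders writes (hypothetical helper)
        self.buffered = 0                    # writes that went through the buffer
        self.bypassed = 0                    # writes that skipped a full buffer
        for _ in range(workers):
            threading.Thread(target=self._evict_loop, daemon=True).start()

    def _persist(self, lba, seq, data):
        # Apply a write only if it is newer than what the block already holds.
        with self.pmem_lock:
            if seq > self.pmem.get(lba, (0, None))[0]:
                self.pmem[lba] = (seq, data)

    def _evict_loop(self):
        # Eager eviction: workers do not wait for the buffer to fill up.
        while True:
            lba, seq, data = self.slots.get()
            self._persist(lba, seq, data)
            self.slots.task_done()

    def write(self, lba, data):
        seq = next(self.seq)
        try:
            self.slots.put_nowait((lba, seq, data))
            self.buffered += 1
        except queue.Full:
            # Bypass: the buffer is full, so write to "PMem" directly
            # instead of stalling until a slot is freed.
            self._persist(lba, seq, data)
            self.bypassed += 1

    def fsync(self):
        # Wait until every buffered write has been persisted.
        self.slots.join()

    def read_persisted(self, lba):
        with self.pmem_lock:
            entry = self.pmem.get(lba)
        return entry[1] if entry else None
```

In this model a write never blocks on a full buffer, and fsync only waits for
whatever the workers have not yet drained. How many writes take the bypass
path depends on thread scheduling, so the split between buffered and bypassed
varies from run to run.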