{"title":"FlexBSO: Flexible Block Storage Offload for Datacenters","authors":"Vojtech Aschenbrenner, John Shawger, Sadman Sakib","doi":"arxiv-2409.02381","DOIUrl":null,"url":null,"abstract":"Efficient virtualization of CPU and memory is standardized and mature.\nCapabilities such as Intel VT-x [3] have been added by manufacturers for\nefficient hypervisor support. In contrast, virtualization of a block device and\nits presentation to the virtual machines on the host can be done in multiple\nways. Indeed, hyperscalers develop in-house solutions to improve performance\nand cost-efficiency of their storage solutions for datacenters. Unfortunately,\nthese storage solutions are based on specialized hardware and software which\nare not publicly available. The traditional solution is to expose virtual block\ndevice to the VM through a paravirtualized driver like virtio [2]. virtio\nprovides significantly better performance than real block device driver\nemulation because of host OS and guest OS cooperation. The IO requests are then\nfulfilled by the host OS either with a local block device such as an SSD drive\nor with some form of disaggregated storage over the network like NVMe-oF or\niSCSI. There are three main problems to the traditional solution. 1) Cost. IO\noperations consume host CPU cycles due to host OS involvement. These CPU cycles\nare doing useless work from the application point of view. 2) Inflexibility.\nAny change of the virtualized storage stack requires host OS and/or guest OS\ncooperation and cannot be done silently in production. 3) Performance. IO\noperations are causing recurring VM EXITs to do the transition from non-root\nmode to root mode on the host CPU. This results into excessive IO performance\nimpact. We propose FlexBSO, a hardware-assisted solution, which solves all the\nmentioned issues. Our prototype is based on the publicly available Bluefield-2\nSmartNIC with NVIDIA SNAP support, hence can be deployed without any obstacles.","PeriodicalId":501333,"journal":{"name":"arXiv - CS - Operating Systems","volume":"19 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Operating Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.02381","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Efficient virtualization of CPU and memory is standardized and mature.
Capabilities such as Intel VT-x [3] have been added by manufacturers for
efficient hypervisor support. In contrast, virtualization of a block device and
its presentation to the virtual machines on the host can be done in multiple
ways. Indeed, hyperscalers develop in-house solutions to improve the performance
and cost-efficiency of their datacenter storage. Unfortunately, these solutions
rely on specialized hardware and software that are not publicly available.
The traditional solution is to expose a virtual block
device to the VM through a paravirtualized driver such as virtio [2]. virtio
provides significantly better performance than emulating a real block device
driver because the host OS and guest OS cooperate. The IO requests are then
fulfilled by the host OS, either with a local block device such as an SSD
or with some form of disaggregated storage over the network, such as NVMe-oF or
iSCSI. There are three main problems with the traditional solution. 1) Cost. IO
operations consume host CPU cycles due to host OS involvement; from the
application's point of view, these cycles perform useless work. 2) Inflexibility.
Any change to the virtualized storage stack requires host OS and/or guest OS
cooperation and cannot be rolled out silently in production. 3) Performance. IO
operations cause recurring VM exits to transition the host CPU from non-root
mode to root mode, which severely degrades IO performance.
We propose FlexBSO, a hardware-assisted solution that addresses all of the
mentioned issues. Our prototype is based on the publicly available BlueField-2
SmartNIC with NVIDIA SNAP support, so it can be deployed without any obstacles.
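To make the contrast concrete, below is a minimal guest-side sketch (not from the paper; the device path /dev/nvme0n1 and the single 4 KiB read are illustrative assumptions). The point is that the guest code is the same whether the block device is a virtio-blk disk whose requests are completed by the host OS or an NVMe device emulated by a SmartNIC as in FlexBSO; the offload is transparent to the application, only the backend behind the device changes.

```c
/*
 * Hedged sketch, not from the paper: a guest application reading one
 * 4 KiB block from a block device with O_DIRECT. The code is identical
 * for a host-completed virtio-blk disk and for a SmartNIC-emulated NVMe
 * device; /dev/nvme0n1 is an assumed device path.
 */
#define _GNU_SOURCE   /* for O_DIRECT */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    const char *dev = "/dev/nvme0n1";   /* assumed guest-visible device */
    const size_t block = 4096;          /* one 4 KiB logical block */

    int fd = open(dev, O_RDONLY | O_DIRECT);
    if (fd < 0) {
        perror("open");
        return EXIT_FAILURE;
    }

    /* O_DIRECT requires a buffer aligned to the logical block size. */
    void *buf = NULL;
    if (posix_memalign(&buf, block, block) != 0) {
        fprintf(stderr, "posix_memalign failed\n");
        close(fd);
        return EXIT_FAILURE;
    }

    /* Read the first block. With a paravirtualized driver this request
     * is completed by the host OS; with SmartNIC-based offload it can be
     * served without host CPU involvement. */
    ssize_t n = pread(fd, buf, block, 0);
    if (n < 0)
        perror("pread");
    else
        printf("read %zd bytes from %s\n", n, dev);

    free(buf);
    close(fd);
    return n == (ssize_t)block ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

Compiled with a standard C compiler and run inside the guest, the same binary exercises either data path, which is what allows the storage stack behind the emulated device to change without guest cooperation.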