Copy-on-Abundant-Write for Nimble File System Clones

ACM Transactions on Storage (TOS) Pub Date : 2021-01-29 DOI:10.1145/3423495

Yang Zhan, Alex Conway, Yizheng Jiao, Nirjhar Mukherjee, Ian Groombridge, M. A. Bender, Martín Farach-Colton, William Jannen, Rob Johnson, Donald E. Porter, Jun Yuan

{"title":"Copy-on-Abundant-Write for Nimble File System Clones","authors":"Yang Zhan, Alex Conway, Yizheng Jiao, Nirjhar Mukherjee, Ian Groombridge, M. A. Bender, Martín Farach-Colton, William Jannen, Rob Johnson, Donald E. Porter, Jun Yuan","doi":"10.1145/3423495","DOIUrl":null,"url":null,"abstract":"Making logical copies, or clones, of files and directories is critical to many real-world applications and workflows, including backups, virtual machines, and containers. An ideal clone implementation meets the following performance goals: (1) creating the clone has low latency; (2) reads are fast in all versions (i.e., spatial locality is always maintained, even after modifications); (3) writes are fast in all versions; (4) the overall system is space efficient. Implementing a clone operation that realizes all four properties, which we call a nimble clone, is a long-standing open problem. This article describes nimble clones in B-ε-tree File System (BetrFS), an open-source, full-path-indexed, and write-optimized file system. The key observation behind our work is that standard copy-on-write heuristics can be too coarse to be space efficient, or too fine-grained to preserve locality. On the other hand, a write-optimized key-value store, such as a Bε-tree or an log-structured merge-tree (LSM)-tree, can decouple the logical application of updates from the granularity at which data is physically copied. In our write-optimized clone implementation, data sharing among clones is only broken when a clone has changed enough to warrant making a copy, a policy we call copy-on-abundant-write. We demonstrate that the algorithmic work needed to batch and amortize the cost of BetrFS clone operations does not erode the performance advantages of baseline BetrFS; BetrFS performance even improves in a few cases. BetrFS cloning is efficient; for example, when using the clone operation for container creation, BetrFS outperforms a simple recursive copy by up to two orders-of-magnitude and outperforms file systems that have specialized Linux Containers (LXC) backends by 3--4×.","PeriodicalId":273014,"journal":{"name":"ACM Transactions on Storage (TOS)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Storage (TOS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3423495","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

Abstract

Making logical copies, or clones, of files and directories is critical to many real-world applications and workflows, including backups, virtual machines, and containers. An ideal clone implementation meets the following performance goals: (1) creating the clone has low latency; (2) reads are fast in all versions (i.e., spatial locality is always maintained, even after modifications); (3) writes are fast in all versions; (4) the overall system is space efficient. Implementing a clone operation that realizes all four properties, which we call a nimble clone, is a long-standing open problem. This article describes nimble clones in B-ε-tree File System (BetrFS), an open-source, full-path-indexed, and write-optimized file system. The key observation behind our work is that standard copy-on-write heuristics can be too coarse to be space efficient, or too fine-grained to preserve locality. On the other hand, a write-optimized key-value store, such as a Bε-tree or an log-structured merge-tree (LSM)-tree, can decouple the logical application of updates from the granularity at which data is physically copied. In our write-optimized clone implementation, data sharing among clones is only broken when a clone has changed enough to warrant making a copy, a policy we call copy-on-abundant-write. We demonstrate that the algorithmic work needed to batch and amortize the cost of BetrFS clone operations does not erode the performance advantages of baseline BetrFS; BetrFS performance even improves in a few cases. BetrFS cloning is efficient; for example, when using the clone operation for container creation, BetrFS outperforms a simple recursive copy by up to two orders-of-magnitude and outperforms file systems that have specialized Linux Containers (LXC) backends by 3--4×.

查看原文本刊更多论文

灵活的文件系统克隆的copy -on- abundance - write

制作文件和目录的逻辑副本或克隆对于许多实际应用程序和工作流(包括备份、虚拟机和容器)至关重要。理想的克隆实现应满足以下性能目标:(1)创建克隆具有低延迟;(2)所有版本的读取速度都很快(即即使经过修改，也始终保持空间局部性);(3)所有版本的写入速度都很快;(4)整体系统空间高效。实现实现所有四个属性的克隆操作，我们称之为灵活的克隆，是一个长期存在的开放性问题。本文描述了B-ε-tree文件系统(BetrFS)中的灵活克隆，这是一个开源的、全路径索引的、写优化的文件系统。我们工作背后的关键观察是，标准的“写时复制”启发式方法可能过于粗糙，无法提高空间效率，或者过于细粒度，无法保留局域性。另一方面，写优化的键值存储，例如bh树或日志结构的合并树(LSM)树，可以将更新的逻辑应用程序与物理复制数据的粒度解耦。在我们的写优化克隆实现中，只有当克隆发生了足够大的变化，需要进行复制时，克隆之间的数据共享才会中断，我们将这种策略称为copy-on- abundance -write。我们证明了批处理和分摊BetrFS克隆操作成本所需的算法工作不会削弱基线BetrFS的性能优势;在少数情况下，BetrFS的性能甚至有所提高。BetrFS克隆效率高;例如，当使用克隆操作创建容器时，BetrFS的性能比简单的递归复制高出两个数量级，比具有专用Linux容器(LXC)后端的文件系统高出3—4倍。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

ACM Transactions on Storage (TOS)

自引率

0.00%

发文量