ACM Transactions on Storage: Latest Articles (Page 10)

Oasis: Controlling Data Migration in Expansion of Object-based Storage Systems

IF 1.7, Zone 3, Computer Science

ACM Transactions on Storage Pub Date : 2022-11-19 DOI: 10.1145/3568424

Yiming Zhang, Li Wang, Shun Gai, Qiwen Ke, Wenhao Li, Zhenlong Song, Guangtao Xue, J. Shu

Abstract: Object-based storage systems are widely used for file storage, block storage, blob (e.g., large video) storage, and similar scenarios, where data is placed across a large number of object storage devices (OSDs). Data placement is critical for the scalability of decentralized object-based storage systems. The state-of-the-art CRUSH placement method is a decentralized algorithm that deterministically places object replicas onto storage devices without relying on a central directory. While enjoying the benefits of decentralization such as high scalability, robustness, and performance, CRUSH-based storage systems suffer from uncontrolled data migration when the capacity of a storage cluster is expanded (i.e., when new OSDs are added). This migration is inherent to CRUSH and causes significant performance degradation when the expansion is nontrivial. This article presents MapX, a novel extension to CRUSH that uses an extra time-dimension mapping (from object creation times to cluster expansion times) to control data migration after cluster expansions. Each expansion is viewed as a new layer of the CRUSH map, represented by a virtual node beneath the CRUSH root. MapX controls the mapping from objects onto layers by manipulating the timestamps of the intermediate placement groups (PGs). MapX is applicable to a large variety of object-based storage scenarios where object timestamps can be maintained as higher-level metadata. We have applied MapX to the state-of-the-art Ceph-RBD (RADOS Block Device) to implement a migration-controllable, decentralized object-based block store called Oasis. Oasis extends the RBD metadata structure to maintain and retrieve approximate object creation times (for migration control) at the granularity of expansion layers. Experimental results show that the MapX-based Oasis block store outperforms the CRUSH-based Ceph-RBD (which is busy migrating objects after expansions) by 3.17× to 4.31× in tail latency, and by 76.3% (respectively, 83.8%) in IOPS for reads (respectively, writes).
Pages: 1-22
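
The time-dimension mapping described in the abstract above can be illustrated with a minimal sketch. It assumes each expansion layer is identified by the time at which it was added and that a timestamp maps to the newest layer that already existed at that time. The names `EXPANSION_TIMES` and `layer_for`, and the sample values, are hypothetical and are not taken from MapX or Ceph.

```python
from bisect import bisect_right

# Hypothetical expansion history in arbitrary time units.
# Layer 0 is the original cluster; each later entry is one expansion.
EXPANSION_TIMES = [0, 100, 250]


def layer_for(timestamp, expansion_times=EXPANSION_TIMES):
    """Return the index of the newest layer created at or before `timestamp`."""
    idx = bisect_right(expansion_times, timestamp) - 1
    if idx < 0:
        raise ValueError("timestamp predates the first layer")
    return idx


# Placement groups (PGs) carry a timestamp; in this sketch a PG's layer
# follows from that timestamp alone, so rewriting the timestamp is what
# moves a PG (and the objects in it) to another layer.
pg_timestamps = {"pg-a": 40, "pg-b": 120}
```

Under these assumptions, data written before an expansion keeps resolving to its old layer, so adding a layer does not by itself trigger migration. Setting `pg_timestamps["pg-a"]` to a later value is the only step that would remap that PG.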

Citations: 0

The what, The from, and The to: The Migration Games in Deduplicated Systems

IF 1.7, Zone 3, Computer Science

ACM Transactions on Storage Pub Date : 2022-11-15 DOI: https://dl.acm.org/doi/10.1145/3565025

Roei Kisous, Ariel Kolikant, Abhinav Duggal, Sarai Sheinvald, Gala Yadgar

Abstract: Deduplication reduces the size of the data stored in large-scale storage systems by replacing duplicate data blocks with references to their unique copies. This creates dependencies between files that contain similar content and complicates the management of data in the system. In this article, we address the problem of data migration, in which files are remapped between different volumes as a result of system expansion or maintenance. The challenge of determining which files and blocks to migrate has been studied extensively for systems without deduplication. In the context of deduplicated storage, however, only simplified migration scenarios have been considered. In this article, we formulate the general migration problem for deduplicated systems as an optimization problem whose objective is to minimize the system’s size while ensuring that the storage load is evenly distributed between the system’s volumes and that the network traffic required for the migration does not exceed its allocation. We then present three algorithms for generating effective migration plans, each based on a different approach and representing a different trade-off between computation time and migration efficiency. Our greedy algorithm provides modest space savings but is appealing thanks to its exceptionally short runtime. Its results can be improved by using larger system representations. Our theoretically optimal algorithm formulates the migration problem as an integer linear programming (ILP) instance. Its migration plans consistently result in smaller and more balanced systems than those of the greedy approach, although its runtime is long and, as a result, the theoretical optimum is not always found. Our clustering algorithm enjoys the best of both worlds: its migration plans are comparable to those generated by the ILP-based algorithm, but its runtime is shorter, sometimes by an order of magnitude. It can be further accelerated at a modest cost in the quality of its results.

Citations: 0

Improving the Endurance of Next Generation SSD’s using WOM-v Codes

IF 1.7, Zone 3, Computer Science

ACM Transactions on Storage Pub Date : 2022-11-14 DOI: 10.1145/3565027

Shehbaz Jaffer, K. Mahdaviani, Bianca Schroeder

Abstract: High-density solid-state drives, such as QLC drives, offer increased storage capacity but an order of magnitude fewer program and erase (P/E) cycles, limiting their endurance and hence usability. We present the design and implementation of non-binary, Voltage-Based Write-Once-Memory (WOM-v) Codes to improve the lifetime of QLC drives. First, we develop a FEMU-based simulator test-bed to evaluate the gains of WOM-v codes on real-world workloads. Second, we propose and implement two optimizations, an efficient garbage collection mechanism and an encoding optimization, to drastically improve WOM-v code endurance without compromising performance. Third, we propose analytical approaches to obtain estimates of the endurance gains under WOM-v codes. We analyze the Greedy garbage collection technique with uniform page access distribution and the Least Recently Written (LRW) garbage collection technique with skewed page access distribution in the context of WOM-v codes. We find that although both approaches overestimate the number of required erase operations, the model based on greedy garbage collection with uniform page access distribution provides tighter bounds. A careful evaluation, including microbenchmarks and trace-driven evaluation, demonstrates that WOM-v codes can reduce erase cycles for QLC drives by 4.4×–11.1× for real-world workloads with minimal performance overheads, resulting in improved QLC SSD lifetime.
Pages: 1-32
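
The general idea behind write-once-memory coding on a multi-level cell can be shown with a toy model. This is a deliberately simplified sketch and not the WOM-v encoding from the article: it assumes a cell whose voltage level can only rise between erases, stores fewer logical values than the cell has levels, and rewrites by moving up to the next level that encodes the new value. The constants `LEVELS` and `VALUES` and both function names are hypothetical.

```python
LEVELS = 16   # voltage levels of one QLC cell
VALUES = 4    # logical values kept per cell in this sketch (2 bits); hypothetical choice


def next_level(level, value):
    """Lowest level >= `level` that encodes `value`, or None if an erase is needed.

    In this toy encoding a level encodes the value `level % VALUES`.
    """
    if not 0 <= level < LEVELS or not 0 <= value < VALUES:
        raise ValueError("level or value out of range")
    target = level + (value - level) % VALUES
    return target if target < LEVELS else None


def count_erases(writes):
    """Count the erases needed to apply a sequence of writes to one cell."""
    level = 0
    erases = 0
    for value in writes:
        target = next_level(level, value)
        if target is None:
            erases += 1
            target = value  # erased cell restarts at level 0, then is programmed up to `value`
        level = target
    return erases
```

In this model, `count_erases([3, 2, 1, 0, 3, 2, 1, 0])` needs a single erase for eight writes. The cost is capacity: the cell holds 2 bits instead of 4.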

Citations: 0

A Disk Failure Prediction Method Based on Active Semi-supervised Learning

IF 1.7, Zone 3, Computer Science

ACM Transactions on Storage Pub Date : 2022-11-12 DOI: https://dl.acm.org/doi/10.1145/3523699

Yang Zhou, Fang Wang, Dan Feng

Abstract: Disk failure has always been a major problem for data centers, leading to data loss. Current disk failure prediction approaches are mostly offline and assume that the disk labels required for training learning models are available and accurate. However, these offline methods are no longer suitable for disk failure prediction tasks in large-scale data centers. Given the explosive volume of data, most methods do not consider that label values may be hard to obtain during training or that the obtained label values may not be completely accurate. These problems further restrict the development of supervised learning and offline modeling in disk failure prediction. In this article, we propose Active Semi-supervised Learning Disk-failure Prediction (ASLDP), a novel disk failure prediction method that uses active learning and semi-supervised learning. According to the characteristics of data in the disk lifecycle, ASLDP applies active learning to clearly labeled samples, selecting valuable samples with the most significant probability uncertainty and eliminating redundancy. For samples that are unclearly labeled or unlabeled, ASLDP uses semi-supervised learning to pre-label them by calculating the conditional values of the samples, and enhances the generalization ability through active learning. Compared with several state-of-the-art offline and online learning approaches, the results on four realistic datasets from Backblaze and Baidu demonstrate that ASLDP achieves stable failure detection rates of 80–85% with low false alarm rates. In addition, we use a dataset from Alibaba to evaluate the generality of ASLDP. Furthermore, ASLDP can overcome the problems of missing sample labels and data redundancy in large data centers, which, to the best of our knowledge, are not considered or implemented in any offline learning method for disk failure prediction. Finally, ASLDP can predict disk failure 4.9 days in advance with lower overhead and latency.
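
Selecting samples "with the most significant probability uncertainty" is commonly done with uncertainty sampling. The sketch below shows that generic technique for a binary failure predictor. It is an assumption that ASLDP's selection resembles this, and the function name and sample probabilities are hypothetical.

```python
def select_most_uncertain(probabilities, k):
    """Return the indices of the k samples whose predicted failure
    probability is closest to 0.5, i.e. where a binary model is least sure.

    Ties are broken by the lower index so the result is deterministic.
    """
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: (abs(probabilities[i] - 0.5), i),
    )
    return ranked[:k]


# Hypothetical model outputs for five disks.
predicted = [0.95, 0.52, 0.10, 0.45, 0.70]
to_label = select_most_uncertain(predicted, 2)
```

Here `to_label` holds the indices of the two disks the model is least certain about. These are the samples worth spending labeling effort on, while confidently classified samples add little new information.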

Citations: 0

Ares: Adaptive, Reconfigurable, Erasure coded, Atomic Storage

IF 1.7, Zone 3, Computer Science

ACM Transactions on Storage Pub Date : 2022-11-12 DOI: https://dl.acm.org/doi/10.1145/3510613

Nicolas Nicolaou, Viveck Cadambe, N. Prakash, Andria Trigeorgi, Kishori Konwar, Muriel Medard, Nancy Lynch

Abstract: Emulating a shared atomic, read/write storage system is a fundamental problem in distributed computing. Replicating atomic objects among a set of data hosts was the norm for traditional implementations (e.g., [11]) in order to guarantee the availability and accessibility of the data despite host failures. As replication is highly storage demanding, recent approaches suggested the use of erasure codes to offer the same fault tolerance while optimizing storage usage at the hosts. Initial works focused on a fixed set of data hosts. To guarantee longevity and scalability, a storage service should be able to dynamically mask host failures by allowing new hosts to join and failed hosts to be removed without service interruptions. This work presents the first erasure-code-based atomic algorithm, called Ares, which allows the set of hosts to be modified in the course of an execution. Ares is composed of three main components: (i) a reconfiguration protocol, (ii) a read/write protocol, and (iii) a set of data access primitives (DAPs). The design of Ares is modular and accommodates the use of various erasure-code parameters on a per-configuration basis. We provide bounds on the latency of read/write operations and analyze the storage and communication costs of the Ares algorithm.

Citations: 0

Tunable Encrypted Deduplication with Attack-resilient Key Management

IF 1.7, Zone 3, Computer Science

ACM Transactions on Storage Pub Date : 2022-11-11 DOI: https://dl.acm.org/doi/10.1145/3510614

Zuoru Yang, Jingwei Li, Yanjing Ren, Patrick P. C. Lee

Abstract: Conventional encrypted deduplication approaches retain the deduplication capability on duplicate chunks after encryption by always deriving the key for encryption/decryption from the chunk content, but such a deterministic nature causes information leakage due to frequency analysis. We present TED, a tunable encrypted deduplication primitive that provides a tunable mechanism for balancing the tradeoff between storage efficiency and data confidentiality. The core idea of TED is that its key derivation is based on not only the chunk content but also the number of duplicate chunk copies, such that duplicate chunks are encrypted by distinct keys in a controlled manner. In particular, TED allows users to configure a storage blowup factor, under which the information leakage quantified by an information-theoretic measure is minimized for any input workload. In addition, we extend TED with a distributed key management architecture and propose two attack-resilient key generation schemes that trade between performance and fault tolerance. We implement an encrypted deduplication prototype TEDStore to realize TED in networked environments. Evaluation on real-world file system snapshots shows that TED effectively balances the tradeoff between storage efficiency and data confidentiality, with small performance overhead.
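
A key derivation that depends on both the chunk content and the number of duplicate copies can be sketched as follows. This is an illustration under stated assumptions, not TED's actual scheme: the caller supplies how many copies of the chunk have been seen so far (`copy_index`), and a hypothetical tuning parameter `t` controls how many copies share one key.

```python
import hashlib


def derive_key(chunk: bytes, copy_index: int, t: int) -> bytes:
    """Derive a 32-byte key from the chunk content and its duplicate count.

    Copies 0..t-1 of the same chunk share one key (and still deduplicate),
    copies t..2t-1 share the next key, and so on.
    """
    if t < 1 or copy_index < 0:
        raise ValueError("t must be >= 1 and copy_index must be >= 0")
    fingerprint = hashlib.sha256(chunk).digest()
    bucket = copy_index // t
    return hashlib.sha256(fingerprint + bucket.to_bytes(8, "big")).digest()
```

A larger `t` keeps more duplicates under one key, which favors storage efficiency. A smaller `t` spreads duplicates over more keys, which flattens the ciphertext frequency distribution at the cost of storing more copies.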

Citations: 0

Toward Fast and Scalable Random Walks over Disk-Resident Graphs via Efficient I/O Management

IF 1.7, Zone 3, Computer Science

ACM Transactions on Storage Pub Date : 2022-11-11 DOI: https://dl.acm.org/doi/10.1145/3533579

Rui Wang, Yongkun Li, Yinlong Xu, Hong Xie, John C. S. Lui, Shuibing He

Abstract: Traditional graph systems mainly use the iteration-based model, which iteratively loads graph blocks into memory for analysis so as to reduce random I/Os. However, this iteration-based model limits the efficiency and scalability of running random walks, a fundamental technique for analyzing large graphs. In this article, we first propose a state-aware I/O model to improve the I/O efficiency of running random walks, then develop a block-centric indexing and buffering scheme for managing walk data, and leverage an asynchronous walk updating strategy to improve random walk efficiency. We implement an I/O-efficient graph system, GraphWalker, which efficiently handles very large disk-resident graphs and scales to tens of billions of random walks on a single commodity machine. Experiments show that GraphWalker can achieve more than an order of magnitude speedup when compared with DrunkardMob, which is tailored for random walks based on the classical graph system GraphChi, as well as two state-of-the-art single-machine graph systems, Graphene and GraFSoft. Furthermore, when compared with the most recent distributed system KnightKing, GraphWalker still achieves comparable performance with only a single machine, thereby making it a more cost-effective alternative.

Citations: 0

Introduction to the Special Section on USENIX FAST 2022

IF 1.7, Zone 3, Computer Science

ACM Transactions on Storage Pub Date : 2022-11-11 DOI: 10.1145/3564770

Dean Hildebrand, Donald Porter

Citations: 0

ctFS: Replacing File Indexing with Hardware Memory Translation through Contiguous File Allocation for Persistent Memory