ACM Transactions on Storage最新文献

筛选
英文 中文
Reliability Evaluation of Erasure-coded Storage Systems with Latent Errors 具有潜在错误的擦除编码存储系统的可靠性评估
IF 1.7 3区 计算机科学
ACM Transactions on Storage Pub Date : 2022-11-19 DOI: 10.1145/3568313
I. Iliadis
{"title":"Reliability Evaluation of Erasure-coded Storage Systems with Latent Errors","authors":"I. Iliadis","doi":"10.1145/3568313","DOIUrl":"https://doi.org/10.1145/3568313","url":null,"abstract":"Large-scale storage systems employ erasure-coding redundancy schemes to protect against device failures. The adverse effect of latent sector errors on the Mean Time to Data Loss (MTTDL) and the Expected Annual Fraction of Data Loss (EAFDL) reliability metrics is evaluated. A theoretical model capturing the effect of latent errors and device failures is developed, and closed-form expressions for the metrics of interest are derived. The MTTDL and EAFDL of erasure-coded systems are obtained analytically for (i) the entire range of bit error rates; (ii) the symmetric, clustered, and declustered data placement schemes; and (iii) arbitrary device failure and rebuild time distributions under network rebuild bandwidth constraints. The range of error rates that deteriorate system reliability is derived analytically. For realistic values of sector error rates, the results obtained demonstrate that MTTDL degrades, whereas, for moderate erasure codes, EAFDL remains practically unaffected. It is demonstrated that, in the range of typical sector error rates and for very powerful erasure codes, EAFDL degrades as well. It is also shown that the declustered data placement scheme offers superior reliability.","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2022-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46039172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Efficient Crash Consistency for NVMe over PCIe and RDMA NVMe在PCIe和RDMA上的高效崩溃一致性
IF 1.7 3区 计算机科学
ACM Transactions on Storage Pub Date : 2022-11-19 DOI: 10.1145/3568428
Xiaojian Liao, Youyou Lu, Zhe Yang, J. Shu
{"title":"Efficient Crash Consistency for NVMe over PCIe and RDMA","authors":"Xiaojian Liao, Youyou Lu, Zhe Yang, J. Shu","doi":"10.1145/3568428","DOIUrl":"https://doi.org/10.1145/3568428","url":null,"abstract":"This article presents crash-consistent Non-Volatile Memory Express (ccNVMe), a novel extension of the NVMe that defines how host software communicates with the non-volatile memory (e.g., solid-state drive) across a PCI Express bus and RDMA-capable networks with both crash consistency and performance efficiency. Existing storage systems pay a huge tax on crash consistency, and thus cannot fully exploit the multi-queue parallelism and low latency of the NVMe and RDMA interfaces. ccNVMe alleviates this major bottleneck by coupling the crash consistency to the data dissemination. This new idea allows the storage system to achieve crash consistency by taking the free rides of the data dissemination mechanism of NVMe, using only two lightweight memory-mapped I/Os (MMIOs), unlike traditional systems that use complex update protocol and synchronized block I/Os. ccNVMe introduces a series of techniques including transaction-aware MMIO/doorbell and I/O command coalescing to reduce the PCIe traffic as well as to provide atomicity. We present how to build a high-performance and crash-consistent file system named MQFS atop ccNVMe. We experimentally show that MQFS increases the IOPS of RocksDB by 36% and 28% compared to a state-of-the-art file system and Ext4 without journaling, respectively.","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2022-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41673946","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
End-to-end I/O Monitoring on Leading Supercomputers 领先超级计算机上的端到端I/O监控
IF 1.7 3区 计算机科学
ACM Transactions on Storage Pub Date : 2022-11-19 DOI: 10.1145/3568425
Bin Yang, W. Xue, Tianyu Zhang, Shichao Liu, Xiaosong Ma, Xiyang Wang, Weiguo Liu
{"title":"End-to-end I/O Monitoring on Leading Supercomputers","authors":"Bin Yang, W. Xue, Tianyu Zhang, Shichao Liu, Xiaosong Ma, Xiyang Wang, Weiguo Liu","doi":"10.1145/3568425","DOIUrl":"https://doi.org/10.1145/3568425","url":null,"abstract":"This paper offers a solution to overcome the complexities of production system I/O performance monitoring. We present Beacon, an end-to-end I/O resource monitoring and diagnosis system for the 40960-node Sunway TaihuLight supercomputer, currently the fourth-ranked supercomputer in the world. Beacon simultaneously collects and correlates I/O tracing/profiling data from all the compute nodes, forwarding nodes, storage nodes, and metadata servers. With mechanisms such as aggressive online and offline trace compression and distributed caching/storage, it delivers scalable, low-overhead, and sustainable I/O diagnosis under production use. With Beacon’s deployment on TaihuLight for more than three years, we demonstrate Beacon’s effectiveness with real-world use cases for I/O performance issue identification and diagnosis. It has already successfully helped center administrators identify obscure design or configuration flaws, system anomaly occurrences, I/O performance interference, and resource under- or over-provisioning problems. Several of the exposed problems have already been fixed, with others being currently addressed. Encouraged by Beacon’s success in I/O monitoring, we extend it to monitor interconnection networks, which is another contention point on supercomputers. In addition, we demonstrate Beacon’s generality by extending it to other supercomputers. Both Beacon codes and part of collected monitoring data are released.1","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2022-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41618883","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 41
Extending and Programming the NVMe I/O Determinism Interface for Flash Arrays 闪存阵列NVMe I/O确定性接口的扩展与编程
IF 1.7 3区 计算机科学
ACM Transactions on Storage Pub Date : 2022-11-19 DOI: 10.1145/3568427
Huaicheng Li, Martin L. Putra, Ronald Shi, Fadhil I. Kurnia, Xing Lin, Jaeyoung Do, A. I. Kistijantoro, G. Ganger, Haryadi S. Gunawi
{"title":"Extending and Programming the NVMe I/O Determinism Interface for Flash Arrays","authors":"Huaicheng Li, Martin L. Putra, Ronald Shi, Fadhil I. Kurnia, Xing Lin, Jaeyoung Do, A. I. Kistijantoro, G. Ganger, Haryadi S. Gunawi","doi":"10.1145/3568427","DOIUrl":"https://doi.org/10.1145/3568427","url":null,"abstract":"Predictable latency on flash storage is a long-pursuit goal, yet unpredictability stays due to the unavoidable disturbance from many well-known SSD internal activities. To combat this issue, the recent NVMe IO Determinism (IOD) interface advocates host-level controls to SSD internal management tasks. Although promising, challenges remain on how to exploit it for truly predictable performance. We present IODA,1 an I/O deterministic flash array design built on top of small but powerful extensions to the IOD interface for easy deployment. IODA exploits data redundancy in the context of IOD for a strong latency predictability contract. In IODA, SSDs are expected to quickly fail an I/O on purpose to allow predictable I/Os through proactive data reconstruction. In the case of concurrent internal operations, IODA introduces busy remaining time exposure and predictable-latency-window formulation to guarantee predictable data reconstructions. Overall, IODA only adds five new fields to the NVMe interface and a small modification in the flash firmware while keeping most of the complexity in the host OS. Our evaluation shows that IODA improves the 95–99.99th latencies by up to 75×. IODA is also the nearest to the ideal, no disturbance case compared to seven state-of-the-art preemption, suspension, GC coordination, partitioning, tiny-tail flash controller, prediction, and proactive approaches.","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2022-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44129789","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Oasis: Controlling Data Migration in Expansion of Object-based Storage Systems Oasis:对象存储系统扩展中的数据迁移控制
IF 1.7 3区 计算机科学
ACM Transactions on Storage Pub Date : 2022-11-19 DOI: 10.1145/3568424
Yiming Zhang, Li Wang, Shun Gai, Qiwen Ke, Wenhao Li, Zhenlong Song, Guangtao Xue, J. Shu
{"title":"Oasis: Controlling Data Migration in Expansion of Object-based Storage Systems","authors":"Yiming Zhang, Li Wang, Shun Gai, Qiwen Ke, Wenhao Li, Zhenlong Song, Guangtao Xue, J. Shu","doi":"10.1145/3568424","DOIUrl":"https://doi.org/10.1145/3568424","url":null,"abstract":"Object-based storage systems have been widely used for various scenarios such as file storage, block storage, blob (e.g., large videos) storage, and so on, where the data is placed among a large number of object storage devices (OSDs). Data placement is critical for the scalability of decentralized object-based storage systems. The state-of-the-art CRUSH placement method is a decentralized algorithm that deterministically places object replicas onto storage devices without relying on a central directory. While enjoying the benefits of decentralization such as high scalability, robustness, and performance, CRUSH-based storage systems suffer from uncontrolled data migration when expanding the capacity of the storage clusters (i.e., adding new OSDs), which is determined by the nature of CRUSH and will cause significant performance degradation when the expansion is nontrivial. This article presents MapX, a novel extension to CRUSH that uses an extra time-dimension mapping (from object creation times to cluster expansion times) for controlling data migration after cluster expansions. Each expansion is viewed as a new layer of the CRUSH map represented by a virtual node beneath the CRUSH root. MapX controls the mapping from objects onto layers by manipulating the timestamps of the intermediate placement groups (PGs). MapX is applicable to a large variety of object-based storage scenarios where object timestamps can be maintained as higher-level metadata. We have applied MapX to the state-of-the-art Ceph-RBD (RADOS Block Device) to implement a migration-controllable, decentralized object-based block store (called Oasis). Oasis extends the RBD metadata structure to maintain and retrieve approximate object creation times (for migration control) at the granularity of expansion layers. Experimental results show that the MapX-based Oasis block store outperforms the CRUSH-based Ceph-RBD (which is busy in migrating objects after expansions) by 3.17× ∼ 4.31× in tail latency, and 76.3% (respectively, 83.8%) in IOPS for reads (respectively, writes).","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2022-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44156164","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
The what, The from, and The to: The Migration Games in Deduplicated Systems 什么,从,到:重复数据删除系统中的迁移游戏
IF 1.7 3区 计算机科学
ACM Transactions on Storage Pub Date : 2022-11-15 DOI: https://dl.acm.org/doi/10.1145/3565025
Roei Kisous, Ariel Kolikant, Abhinav Duggal, Sarai Sheinvald, Gala Yadgar
{"title":"The what, The from, and The to: The Migration Games in Deduplicated Systems","authors":"Roei Kisous, Ariel Kolikant, Abhinav Duggal, Sarai Sheinvald, Gala Yadgar","doi":"https://dl.acm.org/doi/10.1145/3565025","DOIUrl":"https://doi.org/https://dl.acm.org/doi/10.1145/3565025","url":null,"abstract":"<p>Deduplication reduces the size of the data stored in large-scale storage systems by replacing duplicate data blocks with references to their unique copies. This creates dependencies between files that contain similar content and complicates the management of data in the system. In this article, we address the problem of data migration, in which files are remapped between different volumes as a result of system expansion or maintenance. The challenge of determining which files and blocks to migrate has been studied extensively for systems without deduplication. In the context of deduplicated storage, however, only simplified migration scenarios have been considered.</p><p>In this article, we formulate the general migration problem for deduplicated systems as an optimization problem whose objective is to minimize the system’s size while ensuring that the storage load is evenly distributed between the system’s volumes and that the network traffic required for the migration does not exceed its allocation.</p><p>We then present three algorithms for generating effective migration plans, each based on a different approach and representing a different trade-off between computation time and migration efficiency. Our <i>greedy algorithm</i> provides modest space savings but is appealing thanks to its exceptionally short runtime. Its results can be improved by using larger system representations. Our <i>theoretically optimal algorithm</i> formulates the migration problem as an integer linear programming (ILP) instance. Its migration plans consistently result in smaller and more balanced systems than those of the greedy approach, although its runtime is long and, as a result, the theoretical optimum is not always found. Our <i>clustering algorithm</i> enjoys the best of both worlds: its migration plans are comparable to those generated by the ILP-based algorithm, but its runtime is shorter, sometimes by an order of magnitude. It can be further accelerated at a modest cost in the quality of its results.</p>","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2022-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138512830","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Improving the Endurance of Next Generation SSD’s using WOM-v Codes 使用WOM-v码提高下一代SSD的耐久性
IF 1.7 3区 计算机科学
ACM Transactions on Storage Pub Date : 2022-11-14 DOI: 10.1145/3565027
Shehbaz Jaffer, K. Mahdaviani, Bianca Schroeder
{"title":"Improving the Endurance of Next Generation SSD’s using WOM-v Codes","authors":"Shehbaz Jaffer, K. Mahdaviani, Bianca Schroeder","doi":"10.1145/3565027","DOIUrl":"https://doi.org/10.1145/3565027","url":null,"abstract":"High density Solid State Drives, such as QLC drives, offer increased storage capacity, but a magnitude lower Program and Erase (P/E) cycles, limiting their endurance and hence usability. We present the design and implementation of non-binary, Voltage-Based Write-Once-Memory (WOM-v) Codes to improve the lifetime of QLC drives. First, we develop a FEMU based simulator test-bed to evaluate the gains of WOM-v codes on real world workloads. Second, we propose and implement two optimizations, an efficient garbage collection mechanism and an encoding optimization to drastically improve WOM-v code endurance without compromising performance. Third, we propose analytical approaches to obtain estimates of the endurance gains under WOM-v codes. We analyze the Greedy garbage collection technique with uniform page access distribution and the Least Recently Written (LRW) garbage collection technique with skewed page access distribution in the context of WOM-v codes. We find that although both approaches overestimate the number of required erase operations, the model based on greedy garbage collection with uniform page access distribution provides tighter bounds. A careful evaluation, including microbenchmarks and trace-driven evaluation, demonstrates that WOM-v codes can reduce Erase cycles for QLC drives by 4.4×–11.1× for real world workloads with minimal performance overheads resulting in improved QLC SSD lifetime.","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2022-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43627797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
A Disk Failure Prediction Method Based on Active Semi-supervised Learning 基于主动半监督学习的磁盘故障预测方法
IF 1.7 3区 计算机科学
ACM Transactions on Storage Pub Date : 2022-11-12 DOI: https://dl.acm.org/doi/10.1145/3523699
Yang Zhou, Fang Wang, Dan Feng
{"title":"A Disk Failure Prediction Method Based on Active Semi-supervised Learning","authors":"Yang Zhou, Fang Wang, Dan Feng","doi":"https://dl.acm.org/doi/10.1145/3523699","DOIUrl":"https://doi.org/https://dl.acm.org/doi/10.1145/3523699","url":null,"abstract":"<p>Disk failure has always been a major problem for data centers, leading to data loss. Current disk failure prediction approaches are mostly offline and assume that the disk labels required for training learning models are available and accurate. However, these offline methods are no longer suitable for disk failure prediction tasks in large-scale data centers. Behind this explosive amount of data, most methods do not consider whether it is not easy to get the label values during the training or the obtained label values are not completely accurate. These problems further restrict the development of supervised learning and offline modeling in disk failure prediction. In this article, Active Semi-supervised Learning Disk-failure Prediction (<i>ASLDP</i>), a novel disk failure prediction method is proposed, which uses active learning and semi-supervised learning. According to the characteristics of data in the disk lifecycle, <i>ASLDP</i> carries out active learning for those clear labeled samples, which selects valuable samples with the most significant probability uncertainty and eliminates redundancy. For those samples that are unclearly labeled or unlabeled, <i>ASLDP</i> uses semi-supervised learning for pre-labeled by calculating the conditional values of the samples and enhances the generalization ability by active learning. Compared with several state-of-the-art offline and online learning approaches, the results on four realistic datasets from Backblaze and Baidu demonstrate that <i>ASLDP</i> achieves stable failure detection rates of 80–85% with low false alarm rates. In addition, we use a dataset from Alibaba to evaluate the generality of <i>ASLDP</i>. Furthermore, <i>ASLDP</i> can overcome the problem of missing sample labels and data redundancy in large data centers, which are not considered and implemented in all offline learning methods for disk failure prediction to the best of our knowledge. Finally, <i>ASLDP</i> can predict the disk failure 4.9 days in advance with lower overhead and latency.</p>","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2022-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138512864","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Ares: Adaptive, Reconfigurable, Erasure coded, Atomic Storage 领域:自适应、可重构、擦除编码、原子存储
IF 1.7 3区 计算机科学
ACM Transactions on Storage Pub Date : 2022-11-12 DOI: https://dl.acm.org/doi/10.1145/3510613
Nicolas Nicolaou, Viveck Cadambe, N. Prakash, Andria Trigeorgi, Kishori Konwar, Muriel Medard, Nancy Lynch
{"title":"Ares: Adaptive, Reconfigurable, Erasure coded, Atomic Storage","authors":"Nicolas Nicolaou, Viveck Cadambe, N. Prakash, Andria Trigeorgi, Kishori Konwar, Muriel Medard, Nancy Lynch","doi":"https://dl.acm.org/doi/10.1145/3510613","DOIUrl":"https://doi.org/https://dl.acm.org/doi/10.1145/3510613","url":null,"abstract":"<p>Emulating a shared <i>atomic</i>, read/write storage system is a fundamental problem in distributed computing. Replicating atomic objects among a set of data hosts was the norm for traditional implementations (e.g., [11]) in order to guarantee the availability and accessibility of the data despite host failures. As replication is highly storage demanding, recent approaches suggested the use of erasure-codes to offer the same fault-tolerance while optimizing storage usage at the hosts. Initial works focused on a fixed set of data hosts. To guarantee longevity and scalability, a storage service should be able to dynamically mask hosts failures by allowing new hosts to join, and failed host to be removed without service interruptions. This work presents the first erasure-code -based atomic algorithm, called <span>Ares</span>, which allows the set of hosts to be modified in the course of an execution. <span>Ares</span> is composed of three main components: (i) a <i>reconfiguration protocol</i>, (ii) a <i>read/write protocol</i>, and (iii) a set of <i>data access primitives</i> <i>(DAPs)</i>. The design of <span>Ares</span> is modular and is such to accommodate the usage of various erasure-code parameters on a per-configuration basis. We provide bounds on the latency of read/write operations and analyze the storage and communication costs of the <span>Ares</span> algorithm.</p>","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2022-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138542298","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Tunable Encrypted Deduplication with Attack-resilient Key Management 可调加密重复数据删除与攻击弹性密钥管理
IF 1.7 3区 计算机科学
ACM Transactions on Storage Pub Date : 2022-11-11 DOI: https://dl.acm.org/doi/10.1145/3510614
Zuoru Yang, Jingwei Li, Yanjing Ren, Patrick P. C. Lee
{"title":"Tunable Encrypted Deduplication with Attack-resilient Key Management","authors":"Zuoru Yang, Jingwei Li, Yanjing Ren, Patrick P. C. Lee","doi":"https://dl.acm.org/doi/10.1145/3510614","DOIUrl":"https://doi.org/https://dl.acm.org/doi/10.1145/3510614","url":null,"abstract":"<p>Conventional encrypted deduplication approaches retain the deduplication capability on duplicate chunks after encryption by always deriving the key for encryption/decryption from the chunk content, but such a deterministic nature causes information leakage due to frequency analysis. We present <sans-serif>TED</sans-serif>, a tunable encrypted deduplication primitive that provides a tunable mechanism for balancing the tradeoff between storage efficiency and data confidentiality. The core idea of <sans-serif>TED</sans-serif> is that its key derivation is based on not only the chunk content but also the number of duplicate chunk copies, such that duplicate chunks are encrypted by distinct keys in a controlled manner. In particular, <sans-serif>TED</sans-serif> allows users to configure a storage blowup factor, under which the information leakage quantified by an information-theoretic measure is minimized for any input workload. In addition, we extend <sans-serif>TED</sans-serif> with a distributed key management architecture and propose two attack-resilient key generation schemes that trade between performance and fault tolerance. We implement an encrypted deduplication prototype <sans-serif>TEDStore</sans-serif> to realize <sans-serif>TED</sans-serif> in networked environments. Evaluation on real-world file system snapshots shows that <sans-serif>TED</sans-serif> effectively balances the tradeoff between storage efficiency and data confidentiality, with small performance overhead.</p>","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2022-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138512871","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信