Impact of data placement on resilience in large-scale object storage systems

2016 32nd Symposium on Mass Storage Systems and Technologies (MSST). Pub Date: 2016-05-02. DOI: 10.1109/MSST.2016.7897091

P. Carns, K. Harms, John Jenkins, M. Mubarak, R. Ross, C. Carothers

Citations: 5

Abstract

Distributed object storage architectures have become the de facto standard for high-performance storage in big data, cloud, and HPC computing. Object storage deployments using commodity hardware to reduce costs often employ object replication as a method to achieve data resilience. Repairing object replicas after failure is a daunting task for systems with thousands of servers and billions of objects, however, and it is increasingly difficult to evaluate such scenarios at scale on real-world systems. Resilience and availability are both compromised if objects are not repaired in a timely manner. In this work we leverage a high-fidelity discrete-event simulation model to investigate replica reconstruction on large-scale object storage systems with thousands of servers, billions of objects, and petabytes of data. We evaluate the behavior of CRUSH, a well-known object placement algorithm, and identify configuration scenarios in which aggregate rebuild performance is constrained by object placement policies. After determining the root cause of this bottleneck, we then propose enhancements to CRUSH and the usage policies atop it to enable scalable replica reconstruction. We use these methods to demonstrate a simulated aggregate rebuild rate of 410 GiB/s (within 5% of projected ideal linear scaling) on a 1,024-node commodity storage system. We also uncover an unexpected phenomenon in rebuild performance based on the characteristics of the data stored on the system.
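The claim that placement policy can cap aggregate rebuild performance can be illustrated with a toy model. The sketch below is not CRUSH and not the paper's simulation. It assumes objects are bundled into placement groups (a common usage policy atop CRUSH, for example in Ceph), and it maps each group to servers with a simple hash. The names `place` and `rebuild_peers` are hypothetical. It counts how many distinct servers hold surviving replicas of a failed server's data, which bounds how many servers can take part in the rebuild.

```python
import hashlib


def place(group_id: int, num_servers: int, replicas: int) -> list[int]:
    """Map a placement group to `replicas` distinct servers.

    Toy stand-in for a placement algorithm: a hash-based
    pseudo-random draw, NOT the real CRUSH algorithm.
    """
    if replicas > num_servers:
        raise ValueError("need at least as many servers as replicas")
    chosen: list[int] = []
    attempt = 0
    while len(chosen) < replicas:
        digest = hashlib.sha256(f"{group_id}:{attempt}".encode()).digest()
        server = int.from_bytes(digest[:8], "big") % num_servers
        if server not in chosen:  # retry on collision so replicas are distinct
            chosen.append(server)
        attempt += 1
    return chosen


def rebuild_peers(failed: int, num_servers: int, num_groups: int,
                  replicas: int = 3) -> set[int]:
    """Return the servers that share at least one group with `failed`.

    Only these servers hold surviving replicas of the lost data, so
    only they can act as rebuild sources in this simplified model.
    """
    peers: set[int] = set()
    for group_id in range(num_groups):
        servers = place(group_id, num_servers, replicas)
        if failed in servers:
            peers.update(s for s in servers if s != failed)
    return peers


if __name__ == "__main__":
    num_servers = 256  # assumed size, chosen only to keep the demo fast
    for groups_per_server in (1, 8, 64):
        peers = rebuild_peers(0, num_servers, groups_per_server * num_servers)
        print(f"{groups_per_server:3d} groups/server -> "
              f"{len(peers)} of {num_servers - 1} servers can help rebuild")
```

In this model, a failed server that belongs to only a few groups has only a few peers, so rebuild bandwidth cannot grow with cluster size no matter how many servers are added. Whether this is the specific root cause identified in the paper is not stated in the abstract.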
