MCRTREE: A Mutually Cooperative Recovery Scheme for Multiple Losses in Distributed Storage Systems Based on Tree Structure

2014 9th IEEE International Conference on Networking, Architecture, and Storage. Pub Date: 2014-08-06. DOI: 10.1109/NAS.2014.33

Xiaoqiang Pei, Yijie Wang, Xingkong Ma, Yongquan Fu, Fangliang Xu



Abstract

To guarantee the reliability of distributed storage systems, erasure coding has received increasing attention as a redundancy scheme because it greatly improves space efficiency compared with replication. However, repairing the data lost on failed nodes under erasure coding takes a long time and consumes substantial network bandwidth. State-of-the-art studies focus on optimizing repair for the single-node-failure case, yet real-world experiments have clearly shown that multi-node failures do occur in cloud storage systems, and applying single-node repair techniques to the multi-node setting faces efficiency challenges. We propose MCRTREE, a mutually cooperative recovery scheme based on a tree structure for multiple node failures. MCRTREE improves bandwidth utilization and reduces repair time by constructing regeneration trees between each new node (a newcomer) and the surviving nodes (providers). Further, MCRTREE reduces the volume of data that must be transmitted during repair. Numerical experiments show that MCRTREE incurs lower storage cost and maintenance bandwidth than other redundancy recovery schemes. Trace-driven simulation results show that MCRTREE reduces regeneration time by 30% to 50%, improves the probability of successful regeneration by 10% to 20%, and improves data availability by 10% to 20% compared with typical repair schemes.
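The abstract does not say how a regeneration tree is built, so the following is only an illustrative sketch of the general idea for a single newcomer, not the MCRTREE algorithm itself. It assumes that link bandwidths between nodes are known, that repair time is limited by the slowest link in use, and that interior providers can combine what they receive with their own data before forwarding. Under those assumptions, a maximum spanning tree rooted at the newcomer maximizes the bottleneck bandwidth. The function names, node labels, and bandwidth values are all hypothetical.

```python
import heapq


def build_regeneration_tree(bandwidth, newcomer):
    """Grow a maximum spanning tree from the newcomer (Prim's algorithm).

    bandwidth maps frozenset({u, v}) to the bandwidth of link u-v.
    Returns a dict mapping each provider to its parent in the tree.
    """
    nodes = set()
    for link in bandwidth:
        nodes |= link

    parent = {}
    in_tree = {newcomer}
    heap = []  # entries are (-bandwidth, candidate node, node already in tree)

    def push_links(u):
        for v in nodes - in_tree:
            bw = bandwidth.get(frozenset((u, v)))
            if bw is not None:
                heapq.heappush(heap, (-bw, v, u))

    push_links(newcomer)
    while heap and len(in_tree) < len(nodes):
        _, v, u = heapq.heappop(heap)
        if v in in_tree:
            continue  # stale entry: v was attached through a faster link
        in_tree.add(v)
        parent[v] = u
        push_links(v)
    return parent


def bottleneck(bandwidth, parent):
    """Smallest link bandwidth used by the tree (assumed to bound repair speed)."""
    return min(bandwidth[frozenset((c, p))] for c, p in parent.items())


# Hypothetical topology: newcomer "N" and providers "A", "B", "C".
links = {
    frozenset(("N", "A")): 10,
    frozenset(("N", "B")): 2,
    frozenset(("N", "C")): 3,
    frozenset(("A", "B")): 8,
    frozenset(("A", "C")): 6,
    frozenset(("B", "C")): 4,
}

star = {p: "N" for p in ("A", "B", "C")}  # every provider sends directly to N
tree = build_regeneration_tree(links, "N")

print("star bottleneck:", bottleneck(links, star))
print("tree:", tree, "bottleneck:", bottleneck(links, tree))
```

In this made-up topology the direct (star) repair is held back by the slow N-B link, while the tree routes B and C through A. Handling several newcomers cooperatively, which is the paper's contribution, is beyond what this sketch attempts.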
