Region based fault-tolerant distributed file storage system design under budget constraint

2014 6th International Workshop on Reliable Networks Design and Modeling (RNDM). Pub Date: 2014-01-19. DOI: 10.1109/RNDM.2014.7014932

Anisha Mazumder, Arun Das, Chenyang Zhou, Arunabha Sen


Citations: 7

Abstract

Two independent lines of research, (i) erasure-code-based file storage system design and (ii) fault-tolerant network design for spatially correlated (or region-based) failures, have received considerable attention in the networking research community in recent times. A recently proposed (N,K)-coding-based distributed file storage scheme ensures complete reconstruction of a file after network fragmentation due to any single region-based fault. For every region of the network, it stores K distinct file segments in one of the largest connected components that result from the fragmentation of the network due to the failure of that region. This distribution scheme provides an all-region fault-tolerant storage system, in the sense that no matter which region of the network fails, a largest connected component of the fragmented network will still have enough distinct file segments with which to reconstruct the file. However, the storage requirement and the associated cost for such an all-region fault-tolerant storage system may be quite high, so with a limited budget it may not be possible to realize one. We consider a budget-constrained distributed file system design problem and provide solutions that maximize the number of regions that can be made fault-tolerant within the specified budget. We show that the problem is NP-complete and provide an approximation algorithm for it. The performance of the approximation algorithm is evaluated through simulation on two real networks. The simulation results demonstrate that the worst-case experimental performance is significantly better than the worst-case theoretical bound. Moreover, the approximation algorithm almost always produces a near-optimal solution in a fraction of the time needed to find the optimal solution.
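The fault-tolerance condition described in the abstract can be illustrated with a small checker. This is only a sketch under stated assumptions, not the paper's algorithm: the graph is an adjacency dictionary, a region is modeled as a set of nodes that fail together, and a placement maps each node to the set of segment indices it stores. The names `components`, `tolerant_regions`, `adj`, `regions`, and `placement` are all hypothetical. A region counts as tolerated when, after its nodes are removed, at least one of the largest connected components holds at least K distinct segments.

```python
from collections import deque

def components(adj, removed):
    """Connected components of the graph after deleting the nodes in `removed`."""
    seen, comps = set(removed), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        comp, queue = {start}, deque([start])
        while queue:  # breadth-first search over surviving nodes
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    comp.add(v)
                    queue.append(v)
        comps.append(comp)
    return comps

def tolerant_regions(adj, regions, placement, k):
    """Names of regions whose failure leaves a largest component with >= k distinct segments."""
    ok = []
    for name, failed in regions.items():
        comps = components(adj, failed)
        if not comps:
            continue
        largest = max(len(c) for c in comps)
        for comp in comps:
            if len(comp) != largest:
                continue
            segments = set()
            for node in comp:
                segments |= placement.get(node, set())
            if len(segments) >= k:
                ok.append(name)
                break
    return ok

# Hypothetical example: a five-node path a-b-c-d-e with three single-node regions.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
regions = {"R1": {"c"}, "R2": {"a"}, "R3": {"e"}}

full = {"a": {0}, "b": {1}, "e": {2}}   # three stored segments
cheap = {"b": {0}, "d": {1}}            # two stored segments

print(tolerant_regions(adj, regions, full, k=2))
print(tolerant_regions(adj, regions, cheap, k=2))
```

In this toy instance the three-segment placement tolerates every region, while the two-segment placement loses region R1, because removing node c splits the path into two halves that each hold only one segment. Assuming, purely for illustration, that cost is the number of stored segment copies, this is the trade-off the budget-constrained problem addresses: with less storage, the goal becomes maximizing how many regions remain tolerated.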
