A Domain-Based Data Distribution Strategy for Fault Tolerance
Fei Luo, Jianjun Yi
2013 International Conference on Service Sciences (ICSS), published 2013-04-11
DOI: 10.1109/ICSS.2013.46 (https://doi.org/10.1109/ICSS.2013.46)
Citation count: 0
Abstract
Fault tolerance remains one of the central challenges for emerging large-scale cloud storage systems, and the data distribution strategy is critical to it. This paper proposes a two-stage data distribution strategy based on domain selection that tolerates multiple concurrent disk failures. First, storage nodes are abstracted as disks in containers according to their geographical distribution and network environment, and the containers are further grouped into domains. Next, the proposed domain selection algorithm selects r domains from the full set of domains, where r is the number of copies of the data. Each copy is then assigned to one of the selected domains and stored on the virtual disk with the nearest hash value in that domain. Analysis and experiments show that the proposed strategy is efficient and improves fault tolerance.
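The two stages described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the domain selection algorithm or the hash function, so everything concrete here is an assumption: stage 1 uses a rendezvous-style ranking of domains as a stand-in for the authors' algorithm, "nearest hashing value" is read as the smallest circular distance between the hash of the data key and the hash of a virtual disk, and the names `select_domains`, `place_replicas`, and `topology` are hypothetical.

```python
import hashlib

RING = 2 ** 32  # size of the hash space (an assumption, not from the paper)


def h(value: str) -> int:
    """Map a string to a point in [0, RING) using MD5 as a stand-in hash."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def ring_distance(a: int, b: int) -> int:
    """Shortest distance between two points on the circular hash space."""
    d = abs(a - b)
    return min(d, RING - d)


def select_domains(key: str, domains: list[str], r: int) -> list[str]:
    """Stage 1 (stand-in): rank domains by hash(key, domain) and keep r of them."""
    if r > len(domains):
        raise ValueError("r cannot exceed the number of domains")
    ranked = sorted(domains, key=lambda d: h(f"{key}:{d}"), reverse=True)
    return ranked[:r]


def place_replicas(key: str, topology: dict[str, list[str]], r: int) -> dict[str, str]:
    """Stage 2: in each selected domain, pick the virtual disk nearest to hash(key)."""
    key_hash = h(key)
    placement = {}
    for domain in select_domains(key, list(topology), r):
        placement[domain] = min(
            topology[domain],
            key=lambda disk: ring_distance(h(disk), key_hash),
        )
    return placement


# Hypothetical topology: 5 domains, each holding 8 virtual disks.
topology = {f"dom-{i}": [f"dom-{i}/vd{j}" for j in range(8)] for i in range(5)}
placement = place_replicas("object-42", topology, r=3)
for domain, disk in placement.items():
    print(domain, "->", disk)
```

Because each of the r copies lands in a different domain, the failure of all disks in one domain leaves the other r - 1 copies reachable, which is the fault-tolerance property the strategy targets.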