一种基于分组的移位聚类布局的可扩展反向查找方案

2011 IEEE International Parallel & Distributed Processing Symposium Pub Date : 2011-05-16 DOI:10.1109/IPDPS.2011.64

Junyao Zhang, Pengju Shang, Jun Wang

{"title":"一种基于分组的移位聚类布局的可扩展反向查找方案","authors":"Junyao Zhang, Pengju Shang, Jun Wang","doi":"10.1109/IPDPS.2011.64","DOIUrl":null,"url":null,"abstract":"Recent years have witnessed an increasing demand for super data clusters. The super data clusters have reached the petabyte-scale that can consist of thousands or tens of thousands storage nodes at a single site. For this architecture, reliability is becoming a great concern. In order to achieve a high reliability, data recovery and node reconstruction is a must. Although extensive research works have investigated how to sustain high performance and high reliability in case of node failures at large scale, a reverse lookup problem, namely finding the objects list for the failed node remains open. This is especially true for storage systems with high requirement of data integrity and availability, such as scientific research data clusters and etc. Existing solutions are either time consuming or expensive. Meanwhile, replication based block placement can be used to realize fast reverse lookup. However, they are designed for centralized, small-scale storage architectures. In this paper, we propose a fast and efficient reverse lookup scheme named Group-based Shifted Declustering (G-SD) layout that is able to locate the whole content of the failed node. G-SD extends our previous shifted declustering layout and applies to large-scale file systems. Our mathematical proofs and real-life experiments show that G-SD is a scalable reverse lookup scheme that is up to one order of magnitude faster than existing schemes.","PeriodicalId":355100,"journal":{"name":"2011 IEEE International Parallel & Distributed Processing Symposium","volume":"22 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"A Scalable Reverse Lookup Scheme Using Group-Based Shifted Declustering Layout\",\"authors\":\"Junyao Zhang, Pengju Shang, Jun Wang\",\"doi\":\"10.1109/IPDPS.2011.64\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Recent years have witnessed an increasing demand for super data clusters. The super data clusters have reached the petabyte-scale that can consist of thousands or tens of thousands storage nodes at a single site. For this architecture, reliability is becoming a great concern. In order to achieve a high reliability, data recovery and node reconstruction is a must. Although extensive research works have investigated how to sustain high performance and high reliability in case of node failures at large scale, a reverse lookup problem, namely finding the objects list for the failed node remains open. This is especially true for storage systems with high requirement of data integrity and availability, such as scientific research data clusters and etc. Existing solutions are either time consuming or expensive. Meanwhile, replication based block placement can be used to realize fast reverse lookup. However, they are designed for centralized, small-scale storage architectures. In this paper, we propose a fast and efficient reverse lookup scheme named Group-based Shifted Declustering (G-SD) layout that is able to locate the whole content of the failed node. G-SD extends our previous shifted declustering layout and applies to large-scale file systems. Our mathematical proofs and real-life experiments show that G-SD is a scalable reverse lookup scheme that is up to one order of magnitude faster than existing schemes.\",\"PeriodicalId\":355100,\"journal\":{\"name\":\"2011 IEEE International Parallel & Distributed Processing Symposium\",\"volume\":\"22 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2011-05-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2011 IEEE International Parallel & Distributed Processing Symposium\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IPDPS.2011.64\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 IEEE International Parallel & Distributed Processing Symposium","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IPDPS.2011.64","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 3

摘要

近年来，对超级数据集群的需求不断增加。超级数据集群已经达到pb级规模，可以在单个站点上包含数千或数万个存储节点。对于这种体系结构，可靠性正在成为一个非常重要的问题。为了实现高可靠性，必须进行数据恢复和节点重构。尽管大量的研究工作已经研究了如何在大规模节点故障的情况下保持高性能和高可靠性，但反向查找问题(即查找故障节点的对象列表)仍然存在。特别是对于对数据完整性和可用性要求较高的存储系统，如科研数据集群等。现有的解决方案要么耗时，要么昂贵。同时，基于复制的块放置可以实现快速反向查找。然而，它们是为集中的、小规模的存储架构而设计的。在本文中，我们提出了一种快速有效的反向查找方案，称为基于组的移位退聚(G-SD)布局，能够定位故障节点的整个内容。G-SD扩展了我们以前的移位的集群布局，并适用于大型文件系统。我们的数学证明和实际实验表明，G-SD是一种可扩展的反向查找方案，比现有方案快一个数量级。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

A Scalable Reverse Lookup Scheme Using Group-Based Shifted Declustering Layout

Recent years have witnessed an increasing demand for super data clusters. The super data clusters have reached the petabyte-scale that can consist of thousands or tens of thousands storage nodes at a single site. For this architecture, reliability is becoming a great concern. In order to achieve a high reliability, data recovery and node reconstruction is a must. Although extensive research works have investigated how to sustain high performance and high reliability in case of node failures at large scale, a reverse lookup problem, namely finding the objects list for the failed node remains open. This is especially true for storage systems with high requirement of data integrity and availability, such as scientific research data clusters and etc. Existing solutions are either time consuming or expensive. Meanwhile, replication based block placement can be used to realize fast reverse lookup. However, they are designed for centralized, small-scale storage architectures. In this paper, we propose a fast and efficient reverse lookup scheme named Group-based Shifted Declustering (G-SD) layout that is able to locate the whole content of the failed node. G-SD extends our previous shifted declustering layout and applies to large-scale file systems. Our mathematical proofs and real-life experiments show that G-SD is a scalable reverse lookup scheme that is up to one order of magnitude faster than existing schemes.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2011 IEEE International Parallel & Distributed Processing Symposium

自引率

0.00%

发文量