Decentralized replication strategies for P2P based Scientific Data Grid

2008 International Symposium on Information Technology | Pub Date: 2008-09-26 | DOI: 10.1109/ITSIM.2008.4632073

A. Abdullah, M. Othman, Hamidah Ibrahim, Md Nasir Sulaiman, A. T. Othman


Citations: 18

Abstract

A Scientific Data Grid provides geographically distributed resources for large-scale, data-intensive applications that generate large scientific data sets, and it mostly deals with large computational problems. Grid research has produced various ideas and solutions to address these requirements. However, because the number of participants (scientists and institutes) involved in this kind of environment is increasing tremendously, scalability, availability, and reliability have become the core problems for such systems. Peer-to-peer (P2P) is one architecture that promises scalability and support for dynamic environments. In this paper, we present a P2P model for the Scientific Data Grid that uses P2P services to address those problems. For this study, we developed and used our own data grid simulation, written in PARSEC. We illustrate our P2P Scientific Data Grid model, our data grid simulation, and the design of the proposed data replication strategies. We then analyze the performance of the data discovery service with and without the replication strategies in terms of success rate, response time, average number of hops, and bandwidth consumption. The simulation results show how the proposed replication strategies promote high data availability in the proposed Scientific Data Grid model and how they improve the discovery process.
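To make the evaluation metrics concrete, the following is a minimal illustrative sketch, not the authors' PARSEC simulator and not their replication strategies, neither of which is specified in the abstract. It assumes an unstructured random overlay, TTL-limited flooding for discovery, and random replica placement as a stand-in strategy. All names (`build_overlay`, `discover`, `evaluate`) and parameter values are hypothetical. It reports success rate, average hops, and forwarded-message count as a rough proxy for bandwidth consumption; response time is not modeled.

```python
import random


def build_overlay(n_peers, degree, rng):
    """Random undirected overlay: each peer links to up to `degree` random peers."""
    neighbors = {p: set() for p in range(n_peers)}
    for p in range(n_peers):
        for q in rng.sample(range(n_peers), degree):
            if q != p:
                neighbors[p].add(q)
                neighbors[q].add(p)
    return neighbors


def discover(neighbors, holders, start, ttl):
    """TTL-limited flood from `start`.

    Returns (hops, messages); hops is None when no holder is reached
    within `ttl` hops. `messages` counts every forwarded query.
    """
    if start in holders:
        return 0, 0
    seen = {start}
    frontier = [start]
    messages = 0
    for hops in range(1, ttl + 1):
        next_frontier = []
        for p in frontier:
            for q in neighbors[p]:
                messages += 1  # one query message per overlay link used
                if q not in seen:
                    seen.add(q)
                    next_frontier.append(q)
        if any(q in holders for q in next_frontier):
            return hops, messages
        frontier = next_frontier
    return None, messages


def evaluate(n_replicas, n_peers=200, degree=3, ttl=3, n_queries=500, seed=1):
    """Run random queries against a data set stored on `n_replicas` peers."""
    neighbors = build_overlay(n_peers, degree, random.Random(seed))
    # Assumed placement: the first n_replicas peers of a fixed random order,
    # so a larger replica set always contains the smaller one.
    order = list(range(n_peers))
    random.Random(seed + 1).shuffle(order)
    holders = set(order[:n_replicas])
    query_rng = random.Random(seed + 2)  # same query origins for every run
    hits = hop_total = msg_total = 0
    for _ in range(n_queries):
        hops, messages = discover(neighbors, holders, query_rng.randrange(n_peers), ttl)
        msg_total += messages
        if hops is not None:
            hits += 1
            hop_total += hops
    return {
        "success_rate": hits / n_queries,
        "avg_hops": hop_total / hits if hits else float("nan"),
        "avg_messages": msg_total / n_queries,
    }


if __name__ == "__main__":
    for replicas in (1, 5, 20):
        print(replicas, evaluate(replicas))
```

Because the overlay, the query origins, and the nesting of replica sets are held fixed across runs in this sketch, adding replicas can only keep or raise the success rate and keep or lower the message count, which mirrors the kind of with/without-replication comparison the abstract describes.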

