Integration of Task Scheduling with Replica Placement in Data Grid for Limited Disk Space of Resources
Authors: Kan Yi, Feng Ding, Heng Wang
Published in: 2010 Fifth Annual ChinaGrid Conference
Publication date: 2010-07-16
DOI: 10.1109/ChinaGrid.2010.29 (https://doi.org/10.1109/ChinaGrid.2010.29)
Citations: 2
Abstract
A data grid integrates geographically distributed resources to support data-sensitive scientific applications. Because tasks in such applications depend heavily on their input data, handling large amounts of data makes efficient data access critical. The goal of replica placement is to shorten data access time and thereby improve task execution performance, so replica placement strategies are often integrated with task scheduling algorithms. However, all existing integration strategies assume that the disk space of resources in a data grid is unlimited. In this paper, we extend the MinMin heuristic to handle the case where the disk space of a computational resource is limited. In addition, we propose a heuristic replica placement algorithm that also takes the limited disk space of a storage resource into account. Another feature of this replica placement algorithm is that it can map more than one hot file to several storage resources. We evaluate our approach through simulation. The results show that integrating the two algorithms improves the performance of the data grid, especially when the total disk space of the storage resources is small relative to the total size of all data files.
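To make the idea of a disk-aware MinMin concrete, the following is a minimal Python sketch. It is not the algorithm from the paper, whose details the abstract does not give. It assumes a simplified model in which a task can only run on a resource whose disk can hold all of the task's input, and in which completion time is the resource's ready time plus staging time plus compute time. All names (`Task`, `Resource`, `min_min`, and their fields) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    length: float      # compute work, arbitrary units (assumed model)
    input_size: float  # total size of the task's input files


@dataclass
class Resource:
    name: str
    speed: float       # work units processed per time unit
    disk: float        # disk space available for staged input
    bandwidth: float   # data units staged per time unit


def min_min(tasks, resources):
    """Classic Min-Min selection with an added disk-capacity feasibility check.

    Returns (schedule, unschedulable), where schedule is a list of
    (task name, resource name, finish time) in assignment order.
    """
    ready = {r.name: 0.0 for r in resources}  # time each resource becomes free
    pending = list(tasks)
    schedule, unschedulable = [], []

    while pending:
        best = None  # (finish time, task, resource)
        for t in list(pending):
            # Assumption: a resource is eligible only if the input fits on its disk.
            feasible = [r for r in resources if t.input_size <= r.disk]
            if not feasible:
                pending.remove(t)
                unschedulable.append(t.name)
                continue
            for r in feasible:
                finish = (ready[r.name]
                          + t.input_size / r.bandwidth
                          + t.length / r.speed)
                # The smallest finish time over all eligible pairs is the
                # minimum of the per-task minimum completion times.
                if best is None or finish < best[0]:
                    best = (finish, t, r)
        if best is None:
            break
        finish, t, r = best
        ready[r.name] = finish
        pending.remove(t)
        schedule.append((t.name, r.name, finish))

    return schedule, unschedulable
```

In this sketch the disk constraint acts only as a filter on eligible resources. A scheduler integrated with replica placement, as the abstract describes, would also need to account for which files are already replicated near each resource, which is not modeled here.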