{"title":"Simultaneous scheduling of replication and computation for bioinformatics applications on the grid","authors":"F. Desprez, Antoine Vernois","doi":"10.1109/CLADE.2005.1520903","DOIUrl":null,"url":null,"abstract":"One of the first motivations of using grids comes from applications managing large data sets infield such as high energy physics or life sciences. To improve the global throughput of software environments, replicas are usually put at wisely selected sites. Moreover, computation requests have to be scheduled among the available resources. To get the best performance, scheduling and data replication have to be tightly coupled. However, there are few approaches that provide this coupling. This paper presents an algorithm that combines data management and scheduling using a steady-state approach. Our theoretical results are validated using simulation and logs from a large life science application (ACI GRID GriPPS).","PeriodicalId":330715,"journal":{"name":"CLADE 2005. Proceedings Challenges of Large Applications in Distributed Environments, 2005.","volume":"71 ","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2005-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"16","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"CLADE 2005. Proceedings Challenges of Large Applications in Distributed Environments, 2005.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CLADE.2005.1520903","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 16
Abstract
One of the first motivations of using grids comes from applications managing large data sets infield such as high energy physics or life sciences. To improve the global throughput of software environments, replicas are usually put at wisely selected sites. Moreover, computation requests have to be scheduled among the available resources. To get the best performance, scheduling and data replication have to be tightly coupled. However, there are few approaches that provide this coupling. This paper presents an algorithm that combines data management and scheduling using a steady-state approach. Our theoretical results are validated using simulation and logs from a large life science application (ACI GRID GriPPS).