Performance Modeling and Task Scheduling in Distributed Graph Processing
Daniel Presser, Frank Siqueira, Fábio Reina
2018 IEEE International Congress on Big Data (BigData Congress), published 2018-07-01
DOI: 10.1109/BigDataCongress.2018.00025 (https://doi.org/10.1109/BigDataCongress.2018.00025)
Citations: 2
Abstract
The accelerated growth of datasets observed in modern applications also applies to datasets modeled as graphs. To handle this growth, several large-scale distributed graph processing models, such as Pregel, have been proposed. These systems are designed to run on large clusters, where resources must be allocated efficiently. In this paper we present a prediction model and a scheduler for Pregel-based distributed graph processing jobs. The scheduler treats the jobs as moldable tasks and, based on the predictions, allocates to each job the number of workers that best minimizes the makespan. Experimental results show that the prediction model has accuracy close to 90%, allowing the scheduler to work within the theoretical approximation limits of the optimal makespan.
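The idea of a moldable task, whose worker count is chosen by the scheduler from a runtime prediction, can be illustrated with a small sketch. This is not the paper's prediction model or scheduling algorithm: the runtime formula (`serial + parallel / workers + overhead * workers`), the `Job` class, the `allocate_workers` function, and the job names and numbers are all hypothetical, and the sketch assumes every job runs concurrently on a shared pool of workers. It uses a simple greedy heuristic that keeps giving one more worker to whichever job currently determines the makespan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """A moldable job with a hypothetical runtime model (not the paper's)."""
    name: str
    serial: float    # seconds that do not shrink with more workers (assumed)
    parallel: float  # seconds of evenly divisible work (assumed)
    overhead: float  # per-worker coordination cost in seconds (assumed)

    def predicted_runtime(self, workers: int) -> float:
        # Assumed model: fixed part + divided part + cost that grows with workers.
        return self.serial + self.parallel / workers + self.overhead * workers


def allocate_workers(jobs, total_workers):
    """Greedily assign workers to concurrently running jobs.

    Returns (allotment, makespan), where allotment maps job name to worker
    count and makespan is the longest predicted runtime under that allotment.
    """
    if total_workers < len(jobs):
        raise ValueError("need at least one worker per job")

    allotment = {job.name: 1 for job in jobs}  # every job starts with one worker
    spare = total_workers - len(jobs)

    while spare > 0:
        # The slowest job under the current allotment determines the makespan.
        slowest = max(jobs, key=lambda j: j.predicted_runtime(allotment[j.name]))
        current = allotment[slowest.name]
        # Stop when an extra worker no longer speeds up the slowest job.
        if slowest.predicted_runtime(current + 1) >= slowest.predicted_runtime(current):
            break
        allotment[slowest.name] = current + 1
        spare -= 1

    makespan = max(j.predicted_runtime(allotment[j.name]) for j in jobs)
    return allotment, makespan
```

For example, calling `allocate_workers([Job("pagerank", 10, 120, 1), Job("sssp", 5, 40, 1)], 8)` gives the larger job most of the eight workers. A scheduler of this kind is only as good as its predictions, which is why prediction accuracy matters for how close the resulting makespan is to the optimum.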