A Survey of Modern Scientific Workflow Scheduling Algorithms and Systems in the Era of Big Data
Junwen Liu, Shiyong Lu, D. Che
2020 IEEE International Conference on Services Computing (SCC), 2020-11-01
DOI: 10.1109/SCC49832.2020.00026 (https://doi.org/10.1109/SCC49832.2020.00026)
Citations: 5
Abstract
This paper surveys state-of-the-art workflow scheduling algorithms, assuming cloud computing as the underlying compute infrastructure for large-scale scientific workflows involving big data. It also reviews a few selected representative scientific workflow systems in light of usability, performance, popularity, and other prominent features. Most existing related surveys try to be comprehensive in coverage and inevitably fall short in the depth of their treatment of workflow scheduling. In contrast, this survey emphasizes the two dominant factors in workflow scheduling, the makespan and the monetary cost of workflow execution, which results in a useful taxonomy of workflow scheduling algorithms as an additional contribution. The survey tries to maintain a good balance between breadth and depth: after a broad review, it spotlights the top ten representative scheduling algorithms and the top five workflow management systems that leverage cloud infrastructure, with an emphasis on support for big data scientific workflows.