A Data-Aware MultiWorkflow Cluster Scheduler

César Acevedo, P. Hernández, Antonio Espinosa, Víctor Méndez
DOI: 10.5220/0005932000950102 (https://doi.org/10.5220/0005932000950102)
Published in: International Conference on Complex Information Systems, 2016-11-29
Citations: 2

Abstract

Previous scheduling research has focused on analyzing the computation time of application workflows. Current cluster use, however, involves executing multiworkflows that may share applications and input files. To reduce the makespan of such multiworkflows, adequate data allocation policies should be applied to lower input data latency. We propose a scheduling strategy for multiworkflows that takes into account where shared input files reside within the cluster's storage system. We first merge all the workflows in a study and evaluate the resulting global design pattern. We then apply a classic list scheduling heuristic that considers the location of the input files in the storage system, reducing the communication overhead of the applications. We evaluated our proposal on an initial set of experimental environments, with promising results of up to 20% makespan improvement.
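
The two steps described in the abstract (merging the workflows, then list scheduling with awareness of input file location) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Task` structure, the critical-path priority, the two-tier bandwidth model (`local_bw` versus `remote_bw`), and all names and numbers are assumptions made for illustration only.

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Task:
    name: str
    compute: float        # estimated compute time in seconds (assumed > 0)
    inputs: tuple = ()    # names of input files read by the task
    deps: tuple = ()      # predecessor task names within the same workflow


def merge_workflows(workflows):
    """Union several workflows into one DAG, prefixing task names by workflow id."""
    merged = {}
    for wf_id, tasks in workflows.items():
        for t in tasks:
            key = f"{wf_id}.{t.name}"
            deps = tuple(f"{wf_id}.{d}" for d in t.deps)
            merged[key] = Task(key, t.compute, t.inputs, deps)
    return merged


def schedule(dag, nodes, file_location, file_size, local_bw, remote_bw):
    """Greedy list scheduling; returns (placement, makespan)."""
    children = {k: [] for k in dag}
    for k, t in dag.items():
        for d in t.deps:
            children[d].append(k)

    @lru_cache(maxsize=None)
    def rank(k):
        # Priority: length of the longest compute path from k to an exit task.
        return dag[k].compute + max((rank(c) for c in children[k]), default=0.0)

    def read_time(task, node):
        # Hypothetical cost model: a file is read faster from the node that holds it.
        return sum(
            file_size[f] / (local_bw if file_location.get(f) == node else remote_bw)
            for f in task.inputs
        )

    node_free = {n: 0.0 for n in nodes}
    finish, placement = {}, {}
    # With positive compute times, descending rank is a valid topological order.
    for k in sorted(dag, key=rank, reverse=True):
        t = dag[k]
        ready = max((finish[d] for d in t.deps), default=0.0)
        # Pick the node where the task would finish reading its inputs earliest.
        best = min(nodes, key=lambda n: max(node_free[n], ready) + read_time(t, n))
        start = max(node_free[best], ready)
        finish[k] = start + read_time(t, best) + t.compute
        node_free[best] = finish[k]
        placement[k] = best
    return placement, max(finish.values())


# Illustrative input: two workflows sharing the input file "ref.fa", stored on node "n1".
workflows = {
    "wf1": [Task("align", 10.0, ("ref.fa",)), Task("sort", 5.0, (), ("align",))],
    "wf2": [Task("align", 10.0, ("ref.fa",))],
}
dag = merge_workflows(workflows)
placement, makespan = schedule(
    dag,
    nodes=["n0", "n1"],
    file_location={"ref.fa": "n1"},
    file_size={"ref.fa": 100.0},
    local_bw=100.0,
    remote_bw=10.0,
)
print(placement, makespan)
```

In this sketch the highest-priority task that reads the shared file is placed on the node that already stores it, and the second reader runs elsewhere only because waiting for that node would take longer than a remote read. A real scheduler would also need to model storage tiers, contention, and file replication, none of which are covered here.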