Dynamic scheduling of parallel jobs with QoS demands in multiclusters and grids

Ligang He, S. Jarvis, D. P. Spooner, Xinuo Chen, G. Nudd
Fifth IEEE/ACM International Workshop on Grid Computing, published 2004-11-08
DOI: 10.1109/GRID.2004.27 (https://doi.org/10.1109/GRID.2004.27)
Citations: 43

Abstract

This paper addresses the dynamic scheduling of parallel jobs with QoS demands (soft deadlines) in multiclusters and grids. Three metrics (over-deadline, makespan and idle-time) are combined with variable weights to evaluate scheduling performance; they measure, respectively, the extent to which the jobs' QoS demands are met, the resource throughput and the resource utilization. Performance is optimised at two levels in the multicluster. At the multicluster level, a scheduler (which we call MUSCLE) allocates parallel jobs with high packing potential to the same cluster; it also takes the jobs' QoS requirements into account and employs a heuristic to allocate a suitable workload to each cluster so as to balance overall system performance. At the single-cluster level, an existing workload manager called TITAN uses a genetic algorithm to further improve the scheduling of the jobs previously allocated by MUSCLE. Extensive experiments are conducted to verify the effectiveness of the scheduling mechanism and to assess the effect of prediction accuracy on scheduling performance. The results show that, compared with traditional distributed workload allocation policies, the overall scheduling performance of parallel jobs is significantly improved across the multicluster, and that prediction errors do not dramatically weaken this advantage.
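
The abstract does not state how the three metrics are combined, so the following is only an illustrative sketch that assumes a plain linear weighted sum. The names `ScheduledJob` and `schedule_cost`, the weight parameters, and the definitions of each metric (total lateness past the soft deadline, latest finish time, and unused node-time up to the makespan) are assumptions made for this example, not the formulation used by MUSCLE or TITAN.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScheduledJob:
    """A parallel job placed in a schedule (hypothetical structure)."""
    start: float     # time at which the job begins executing
    runtime: float   # predicted execution time
    nodes: int       # number of nodes the job occupies while running
    deadline: float  # soft deadline, i.e. the job's QoS demand

    @property
    def finish(self) -> float:
        return self.start + self.runtime


def schedule_cost(jobs, total_nodes, w_over=1.0, w_makespan=1.0, w_idle=1.0):
    """Combine over-deadline, makespan and idle-time into one score.

    Assumed definitions (lower is better for all three):
      over-deadline: total time by which jobs finish past their deadlines
      makespan:      latest finish time, measured from time 0
      idle-time:     node-time left unused between time 0 and the makespan
    """
    if not jobs:
        return 0.0
    over_deadline = sum(max(0.0, job.finish - job.deadline) for job in jobs)
    makespan = max(job.finish for job in jobs)
    busy_node_time = sum(job.nodes * job.runtime for job in jobs)
    idle_time = total_nodes * makespan - busy_node_time
    return w_over * over_deadline + w_makespan * makespan + w_idle * idle_time
```

Under these assumptions, raising `w_over` relative to the other weights favours schedules that meet deadlines, while raising `w_idle` favours tightly packed schedules. A search procedure such as a genetic algorithm could use a score of this kind as its fitness function, although the abstract does not say how TITAN does so.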