Feasibility analysis of using the Maui Scheduler for job simulation of large-scale PBS based clusters

IF 0.2 Q4 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
Georg Zitzlsberger, B. Jansik, J. Martinovič
DOI: 10.33965/IJCSIS_2018130204 (https://doi.org/10.33965/IJCSIS_2018130204)
Published: 2018-12-17
Publication type: Journal Article
Citations: 2

Abstract

For large-scale High Performance Computing centers with a wide range of projects and heterogeneous infrastructures, efficiency is an important consideration. Improving job scheduling strategies to optimize cluster utilization and job wait times requires an understanding of how compute jobs are scheduled. This increases the importance of a reliable simulation capability, which in turn requires accuracy and comparability with historic workloads from the cluster. Not all job schedulers offer a simulation capability, and the Portable Batch System (PBS) resource manager is among those that do not. Hence, PBS based centers have no direct way to simulate changes and optimizations before applying them to the production system. We propose and discuss how to run job simulations for large-scale PBS based clusters with the Maui Scheduler. This includes awareness of node downtimes, both scheduled and unexpected. For validation, we use historic workloads collected at the IT4Innovations supercomputing center. We demonstrate the viability of our approach by measuring the accuracy of the simulation results against the real workloads. In addition, we discuss how changing the simulator's time step resolution affects both accuracy and simulation time. We are confident that our approach is transferable and can enable job simulations at other computing centers using PBS.
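The abstract's point about time step resolution can be illustrated with a toy model. The following is a minimal sketch only, not the Maui Scheduler and not the authors' method: it assumes a strict first-come-first-served policy on a homogeneous cluster, and the names `Job`, `simulate_fcfs`, and `mean_abs_wait_error`, as well as the error metric itself, are hypothetical choices made for illustration. The run with a 1-second step stands in for the "historic" start times, and a coarser step shows how wait-time error can grow.

```python
from dataclasses import dataclass


@dataclass
class Job:
    submit: int   # submission time in seconds
    runtime: int  # actual runtime in seconds
    nodes: int    # number of nodes requested


def simulate_fcfs(jobs, total_nodes, step):
    """Return job start times (in submission order) for a toy FCFS scheduler.

    Assumption: the scheduler only makes decisions at multiples of `step`
    seconds, which models the simulator's time step resolution.
    """
    running = []  # (end_time, nodes) of jobs currently occupying nodes
    starts = []
    now = 0
    for job in sorted(jobs, key=lambda j: j.submit):
        if job.nodes > total_nodes:
            raise ValueError("job requests more nodes than the cluster has")
        # First scheduler tick at or after submission; never earlier than
        # the previous job's start, which keeps the order strictly FCFS.
        first_tick = -(-job.submit // step) * step
        now = max(now, first_tick)
        while True:
            # Drop jobs that have finished by the current tick.
            running = [(end, n) for end, n in running if end > now]
            free = total_nodes - sum(n for _, n in running)
            if free >= job.nodes:
                break
            now += step
        starts.append(now)
        running.append((now + job.runtime, job.nodes))
    return starts


def mean_abs_wait_error(reference_starts, simulated_starts):
    """Mean absolute difference in wait time, in seconds.

    Submission times are identical in both runs, so the difference in
    wait time equals the difference in start time.
    """
    diffs = [abs(s - r) for r, s in zip(reference_starts, simulated_starts)]
    return sum(diffs) / len(diffs)


# Hypothetical three-job workload on a two-node cluster.
jobs = [Job(0, 100, 2), Job(10, 50, 2), Job(20, 30, 1)]
reference = simulate_fcfs(jobs, total_nodes=2, step=1)   # stand-in for history
coarse = simulate_fcfs(jobs, total_nodes=2, step=60)     # coarser time step
error = mean_abs_wait_error(reference, coarse)
print(reference, coarse, round(error, 2))
```

In this sketch a coarser step needs fewer scheduler iterations, so it runs faster, but jobs can only start on tick boundaries, so start times drift later than in the reference run. A real study would compare against recorded start times from the cluster's accounting logs rather than a fine-grained simulation.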
Source journal: IADIS-International Journal on Computer Science and Information Systems