Hierarchical Scheduling in on-demand GPU-as-a-Service Systems

Federica Filippini, M. Lattuada, A. Jahani, M. Ciavotta, D. Ardagna, E. Amaldi
{"title":"Hierarchical Scheduling in on-demand GPU-as-a-Service Systems","authors":"Federica Filippini, M. Lattuada, A. Jahani, M. Ciavotta, D. Ardagna, E. Amaldi","doi":"10.1109/SYNASC51798.2020.00030","DOIUrl":null,"url":null,"abstract":"Deep learning (DL) methods have recently gained popularity. Training this class of models is, however, computing-intensive, and frequently GPUs are used to boost performance. Although the costs of GPU-based systems are gradually reducing due to the high demand, they are still prohibitive: in public clouds, GPU-powered virtual machines (VMs) time unit price is 5-8x higher than CPU-only VMs. While the cloud remains the most cost-effective and flexible deployment, operation costs can be reduced, in large settings, by rightsizing and sharing resources among multiple processes. This work addresses the online joint capacity planning and job scheduling with due dates problem and proposes alternative matheuristic solution methods. Our objective is to optimize operation costs by: i) rightsizing the VM capacities at each node, ii) partitioning the set of GPUs among multiple concurrent jobs on the same VM, and iii) determining a due-date-aware job schedule. The effectiveness of the proposed hierarchical approach, coupled with an appropriate Mixed Integer Linear Programming formulation, is validated against first-principle methods by relying on simulation. The experiments prove that the efficiency of GPU-based systems evaluated in terms of costs can be improved by 50-70%. Finally, scalability analyses show that the proposed approach enables to solve problem instances with up to 100 nodes in less than one minute on average, making it suitable for practical scenarios.","PeriodicalId":278104,"journal":{"name":"2020 22nd International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 22nd International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SYNASC51798.2020.00030","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

Deep learning (DL) methods have recently gained popularity. Training this class of models is, however, computing-intensive, and frequently GPUs are used to boost performance. Although the costs of GPU-based systems are gradually reducing due to the high demand, they are still prohibitive: in public clouds, GPU-powered virtual machines (VMs) time unit price is 5-8x higher than CPU-only VMs. While the cloud remains the most cost-effective and flexible deployment, operation costs can be reduced, in large settings, by rightsizing and sharing resources among multiple processes. This work addresses the online joint capacity planning and job scheduling with due dates problem and proposes alternative matheuristic solution methods. Our objective is to optimize operation costs by: i) rightsizing the VM capacities at each node, ii) partitioning the set of GPUs among multiple concurrent jobs on the same VM, and iii) determining a due-date-aware job schedule. The effectiveness of the proposed hierarchical approach, coupled with an appropriate Mixed Integer Linear Programming formulation, is validated against first-principle methods by relying on simulation. The experiments prove that the efficiency of GPU-based systems evaluated in terms of costs can be improved by 50-70%. Finally, scalability analyses show that the proposed approach enables to solve problem instances with up to 100 nodes in less than one minute on average, making it suitable for practical scenarios.
按需gpu即服务系统中的分层调度
深度学习(DL)方法最近得到了普及。然而,训练这类模型是计算密集型的,并且经常使用gpu来提高性能。尽管基于gpu的系统的成本由于高需求而逐渐降低,但它们仍然令人望而却步:在公共云中,gpu驱动的虚拟机(vm)时间单位价格比仅cpu的虚拟机高5-8倍。虽然云仍然是最具成本效益和最灵活的部署,但在大型设置中,可以通过在多个进程之间调整大小和共享资源来降低操作成本。本文研究了带到期日的在线联合产能规划和作业调度问题,并提出了可选的数学求解方法。我们的目标是通过以下方式优化操作成本:i)在每个节点上正确调整VM容量,ii)在同一VM上的多个并发作业中划分gpu集,以及iii)确定到期日期感知的作业计划。通过仿真验证了所提出的分层方法的有效性,并结合适当的混合整数线性规划公式对第一性原理方法进行了验证。实验证明,以成本衡量,基于gpu的系统的效率可以提高50-70%。最后,可伸缩性分析表明,所提出的方法能够在平均不到一分钟的时间内解决多达100个节点的问题实例,使其适合于实际场景。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信