Efficient Job Scheduling for Clusters with Shared Tiered Storage

Leah E. Lackner, H. M. Fard, F. Wolf
Published in: 2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), 2019-05-14
DOI: 10.1109/CCGRID.2019.00046 (https://doi.org/10.1109/CCGRID.2019.00046)
Citations: 1

Abstract

New fast storage technologies such as non-volatile memory, which offer one or two orders of magnitude higher I/O bandwidth than traditional back-end storage systems, are becoming ubiquitous in HPC systems. They can be used to greatly speed up I/O operations, an essential prerequisite for data-intensive exascale computing capabilities. However, since the overall capacity of the fast storage available in a system is limited, an individual job may not always benefit if access to fast storage implies a longer waiting time in the queue. This trade-off is most apparent when fast storage is shared across the system. We therefore argue that the decision of whether or not to use fast storage should be supported by the batch scheduler, which can estimate when the amount of fast storage a job requests will become available. We present a scheduling algorithm with this functionality and show in simulations that it significantly reduces makespan and turnaround times in comparison to always using fast storage, always using slow back-end storage, and random storage assignment.
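The trade-off described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm, which the abstract does not specify. It is a simplified model under stated assumptions: each job has a known compute time, I/O volume, and fast-storage demand; running jobs release their fast storage at known estimated end times; compute nodes are always available; and I/O time is simply volume divided by tier bandwidth. All names (`Job`, `Allocation`, `fast_storage_wait`, `choose_tier`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    compute_time: float  # seconds of pure computation (assumed known)
    io_volume: float     # GB transferred during the run (assumed known)
    fast_demand: float   # GB of fast storage the job requests


@dataclass
class Allocation:
    fast_used: float     # GB of fast storage held by a running job
    end_time: float      # estimated time at which that storage is released


def fast_storage_wait(demand, capacity, running, now=0.0):
    """Estimate the earliest time at which `demand` GB of fast storage is free."""
    if demand > capacity:
        return float("inf")  # the request can never be satisfied
    free = capacity - sum(a.fast_used for a in running)
    if free >= demand:
        return now
    # Release allocations in order of estimated end time until enough is free.
    for alloc in sorted(running, key=lambda a: a.end_time):
        free += alloc.fast_used
        if free >= demand:
            return max(now, alloc.end_time)
    return float("inf")


def choose_tier(job, capacity, running, fast_bw, slow_bw, now=0.0):
    """Pick the tier with the earlier estimated completion time.

    Assumption: the slow back-end tier is never capacity-constrained,
    so a job using it can start immediately.
    """
    start_fast = fast_storage_wait(job.fast_demand, capacity, running, now)
    finish_fast = start_fast + job.compute_time + job.io_volume / fast_bw
    finish_slow = now + job.compute_time + job.io_volume / slow_bw
    if finish_fast <= finish_slow:
        return ("fast", finish_fast)
    return ("slow", finish_slow)


# Illustrative numbers only: 100 GB of fast storage, fully occupied.
running = [Allocation(fast_used=80, end_time=500), Allocation(fast_used=20, end_time=100)]
io_heavy = Job(compute_time=100, io_volume=1000, fast_demand=50)
io_light = Job(compute_time=100, io_volume=200, fast_demand=50)
print(choose_tier(io_heavy, 100, running, fast_bw=10, slow_bw=1))
print(choose_tier(io_light, 100, running, fast_bw=10, slow_bw=1))
```

In this toy setting, the I/O-heavy job is better off waiting for fast storage, while the I/O-light job finishes sooner on the slow tier because the queueing delay outweighs the bandwidth gain. A real batch scheduler would also have to account for node availability, backfilling, and the uncertainty of runtime estimates.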