A Batch System with Fair Scheduling for Evolving Applications
Suraj Prabhakaran, Mohsin Iqbal, S. Rinke, Christian Windisch, F. Wolf
2014 43rd International Conference on Parallel Processing, published 2014-10-18
DOI: 10.1109/ICPP.2014.44 (https://doi.org/10.1109/ICPP.2014.44)
Citations: 15
Abstract
Cluster batch systems usually support only static allocation of resources to applications before job start. After job start, applications cannot increase or decrease their resource set. However, some applications evolve unpredictably during execution and thus may require additional resources. If the extra resources cannot be delivered at runtime, those applications may have to run longer to finish, or may not be able to finish at all before their job's time slice expires. Likewise, a job may have to terminate for lack of additional resources once a hardware limit is reached, such as the memory available on a compute node. To avoid such scenarios, users have to make large static allocations to cover a potential demand for resources. This wastes resources, which sit idle until they might actually be used at some unknown point. In this paper, we propose a batch system with dynamic allocation facilities that supports on-the-fly, demand-based resource allocation to unpredictably evolving jobs. We present a novel dynamic resource allocation strategy that also ensures a fair assignment of resources between the usual rigid jobs and the evolving jobs. The results for a CFD production application and a mixed workload of rigid and evolving jobs (based on the widely used ESP benchmark) show that our system not only reduces job waiting and turnaround times, but also increases system utilization and throughput.
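To make the fairness trade-off between rigid and evolving jobs concrete, the following is a minimal toy sketch. It is not the allocation strategy of the paper, which the abstract does not detail. It assumes a homogeneous cluster counted in nodes, jobs that run until the end of their reserved walltime, and one hypothetical policy: an evolving job's request for extra nodes is granted only if enough nodes are idle and the grant delays the first queued rigid job by at most `max_delay` time units. All names (`RunningJob`, `earliest_start`, `grant_dynamic_request`, `max_delay`) are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class RunningJob:
    name: str
    nodes: int       # nodes currently held by the job
    remaining: int   # time units until its reserved walltime ends


def earliest_start(free_nodes, running, needed):
    """Earliest time at which `needed` nodes are free, assuming every
    running job holds its nodes until its walltime ends."""
    if needed <= free_nodes:
        return 0
    available = free_nodes
    for job in sorted(running, key=lambda j: j.remaining):
        available += job.nodes
        if available >= needed:
            return job.remaining
    return None  # the request exceeds the whole cluster


def grant_dynamic_request(free_nodes, running, requester, extra,
                          queued_rigid_nodes, max_delay):
    """Hypothetical policy: let the evolving job `requester` grow by
    `extra` nodes only if the first queued rigid job is delayed by at
    most `max_delay` time units."""
    if extra > free_nodes:
        return False  # not enough idle nodes right now
    before = earliest_start(free_nodes, running, queued_rigid_nodes)
    if before is None:
        return True   # the queued job can never fit, so nothing is delayed
    # Cluster state if the request were granted.
    grown = [RunningJob(j.name, j.nodes + extra, j.remaining)
             if j.name == requester else j
             for j in running]
    after = earliest_start(free_nodes - extra, grown, queued_rigid_nodes)
    return after - before <= max_delay


# Example: 8-node cluster, 2 nodes idle; job A is the evolving job.
running = [RunningJob("A", 2, 10), RunningJob("B", 4, 3)]
# A rigid job needing 6 nodes would be pushed from t=3 to t=10: denied.
denied = grant_dynamic_request(2, running, "A", 2, 6, max_delay=0)
# A rigid job needing 4 nodes still starts at t=3: granted.
granted = grant_dynamic_request(2, running, "A", 2, 4, max_delay=0)
```

In this sketch, raising `max_delay` favors evolving jobs, while setting it to zero protects queued rigid jobs completely; a real batch system would also have to consider priorities, backfilling, and reservations.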