Dynamic Multi-Resource Monitoring for Predictive Job Scheduling with ScoPro
A. Sodan, Lun Liu
2005 IEEE International Conference on Cluster Computing, 2005-09-01
DOI: 10.1109/CLUSTR.2005.347013 (https://doi.org/10.1109/CLUSTR.2005.347013)
Citations: 6
Abstract
Modern job schedulers are moving towards dynamic approaches such as time sharing and adaptive resource allocation, either to accommodate grid jobs or to make better use of local resources. In addition, the resources may be heterogeneous, and a proper distribution of the application's workload may be hard to estimate. Our ScoPro monitoring tool makes it possible to obtain and store resource-related behavior information for parallel applications. This information is used to create an application signature for predictive use in future runs, and to dynamically check for competition under time-shared execution and for workload imbalances on heterogeneous resources. ScoPro is applicable to production runs on standard clusters. As its main innovative contributions, ScoPro can be triggered by job-scheduling events, can monitor several coscheduled jobs concurrently for accurate prediction of slowdowns, and performs real-time short-period measurements with low intrusion during monitoring, while avoiding any intrusion overhead for the non-monitored part of the job execution.
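
To make the notions of an application signature, a slowdown prediction for coscheduled jobs, and a workload-imbalance check more concrete, the following is a minimal sketch. It is not ScoPro's implementation: the `Sample` fields, the function names, and the simple CPU-demand slowdown model are all assumptions chosen for illustration, since the abstract does not specify how ScoPro computes any of these.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Sample:
    """One short-period measurement for one process (hypothetical fields)."""
    node: str
    cpu_util: float   # fraction of the interval spent on the CPU, 0.0..1.0
    mem_mb: float     # resident memory in MB
    net_mb_s: float   # network traffic in MB/s


def build_signature(samples):
    """Average each resource over all samples: a toy 'application signature'."""
    return {
        "cpu_util": mean(s.cpu_util for s in samples),
        "mem_mb": mean(s.mem_mb for s in samples),
        "net_mb_s": mean(s.net_mb_s for s in samples),
    }


def estimate_slowdown(signatures):
    """Assumed model: jobs time-sharing one CPU are stretched by their
    combined CPU demand once that demand exceeds 1.0."""
    total_cpu = sum(sig["cpu_util"] for sig in signatures)
    return max(1.0, total_cpu)


def imbalance(samples):
    """Max-over-mean CPU utilization across nodes; 1.0 means balanced."""
    per_node = {}
    for s in samples:
        per_node.setdefault(s.node, []).append(s.cpu_util)
    node_means = [mean(values) for values in per_node.values()]
    overall = mean(node_means)
    # Guard against idle nodes everywhere, which would divide by zero.
    return max(node_means) / overall if overall > 0 else 1.0


if __name__ == "__main__":
    job_a = [Sample("n1", 0.9, 100.0, 1.0), Sample("n2", 0.3, 100.0, 1.0)]
    job_b = [Sample("n1", 0.7, 50.0, 0.5), Sample("n2", 0.7, 50.0, 0.5)]
    sig_a, sig_b = build_signature(job_a), build_signature(job_b)
    print("predicted slowdown:", estimate_slowdown([sig_a, sig_b]))
    print("imbalance of job A:", imbalance(job_a))
```

In this sketch a stored signature from an earlier run feeds the slowdown estimate for a planned coscheduling decision, while the imbalance ratio is computed from live samples. A real tool would also weigh memory and network contention, which this CPU-only model ignores.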