Stephan Schlagkamp, Rafael Ferreira da Silva, W. Allcock, E. Deelman, U. Schwiegelshohn
Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing, published 2016-05-31. DOI: 10.1145/2907294.2907314 (https://doi.org/10.1145/2907294.2907314)
Consecutive Job Submission Behavior at Mira Supercomputer
Understanding user behavior is crucial for evaluating scheduling and allocation performance in HPC environments. This paper aims to further understand how users react dynamically to different levels of system performance by analyzing recorded data for delays in subsequent job submissions. To that end, we characterize a workload trace covering one year of job submissions from the Mira supercomputer at ALCF (Argonne Leadership Computing Facility). We perform an in-depth analysis of correlations between job characteristics, system performance metrics, and subsequent user behavior. The results show that user behavior is significantly influenced by long waiting times, and that more complex jobs (in terms of number of nodes and CPU hours) lead to longer delays in subsequent job submissions.
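As a rough illustration of the quantity discussed above, the following sketch pairs each job's waiting time with the delay before the same user's next submission. It is not the authors' code: the tuple layout `(user, submit, start, end)`, the function name `subsequent_delays`, and the choice to measure the delay from job completion to the next submission are all assumptions made for this example, and the sample data is invented.

```python
from collections import defaultdict


def subsequent_delays(jobs):
    """Pair each job's waiting time with the delay until the user's next submission.

    jobs: iterable of (user, submit, start, end) tuples, times in seconds.
    This record layout is hypothetical, not the Mira trace format.
    Returns a list of (waiting_time, delay) tuples.
    """
    by_user = defaultdict(list)
    for user, submit, start, end in jobs:
        by_user[user].append((submit, start, end))

    pairs = []
    for user_jobs in by_user.values():
        user_jobs.sort()  # order each user's jobs by submission time
        for (submit, start, end), (next_submit, _, _) in zip(user_jobs, user_jobs[1:]):
            delay = next_submit - end  # assumed definition: completion to next submission
            if delay >= 0:  # skip jobs submitted before the previous one finished
                pairs.append((start - submit, delay))
    return pairs


# Invented toy data, for illustration only.
toy_jobs = [
    ("alice", 0, 10, 100),
    ("alice", 160, 400, 500),
    ("alice", 900, 910, 1000),
    ("bob", 50, 60, 70),
    ("bob", 65, 80, 90),
]
print(subsequent_delays(toy_jobs))
```

The resulting pairs could then be binned by waiting time, node count, or CPU hours to look for the kind of correlation the abstract reports.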