RLSK: A Job Scheduler for Federated Kubernetes Clusters based on Reinforcement Learning
Jiaming Huang, C. Xiao, Weigang Wu
2020 IEEE International Conference on Cloud Engineering (IC2E), published 2020-04-01
DOI: https://doi.org/10.1109/IC2E48712.2020.00019
Job scheduling in a cluster is often considered a difficult online decision-making problem, and a good solution depends largely on understanding the workload and the environment. The usual practice is to propose a simple heuristic scheduling algorithm first and then improve it gradually through repeated, tedious manual testing and tuning against the characteristics of the workload. In this work, focusing on multi-cluster environments, load balancing, and efficient scheduling, we present RLSK, a deep reinforcement learning based job scheduler that adaptively schedules independent batch jobs among multiple federated cloud computing clusters. Given only directly specified high-level scheduling targets, RLSK interacts with the system environment and automatically learns scheduling strategies from experience, assuming no prior knowledge of the underlying multi-cluster environment and requiring no human instructions, which avoids the tedious manual testing and tuning. We implement our scheduler on top of Kubernetes and conduct simulations to evaluate the performance of our design. The results show that RLSK can outperform traditional scheduling algorithms.
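As a rough illustration of the idea of stating a high-level target as a reward and letting a scheduler learn placements from experience, the sketch below trains a toy agent to pick a cluster for each arriving job. It is not the RLSK implementation: it swaps deep reinforcement learning for plain tabular Q-learning, and the cluster count, capacity, job sizes, load-imbalance reward, and all function names are assumptions made for the example.

```python
import random
from collections import defaultdict

NUM_CLUSTERS = 3   # hypothetical federation size
CAPACITY = 10      # hypothetical per-cluster capacity, in abstract load units


def observe(loads):
    """State: each cluster's load bucketed into low/medium/high (0, 1, 2)."""
    return tuple(min(2, load * 3 // CAPACITY) for load in loads)


def reward(loads):
    """High-level target expressed as a reward: penalise load imbalance."""
    return -(max(loads) - min(loads))


def step(loads, action, job_size):
    """Place the job on the chosen cluster, score the placement,
    then let every cluster finish one unit of work."""
    loads = list(loads)
    loads[action] = min(CAPACITY, loads[action] + job_size)
    r = reward(loads)
    loads = [max(0, load - 1) for load in loads]
    return loads, r


def train(episodes=2000, steps=50, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (toy stand-in for deep RL)."""
    rng = random.Random(seed)
    q = defaultdict(lambda: [0.0] * NUM_CLUSTERS)
    for _ in range(episodes):
        loads = [0] * NUM_CLUSTERS
        for _ in range(steps):
            state = observe(loads)
            if rng.random() < epsilon:
                action = rng.randrange(NUM_CLUSTERS)
            else:
                action = max(range(NUM_CLUSTERS), key=lambda a: q[state][a])
            loads, r = step(loads, action, job_size=rng.randint(1, 4))
            next_state = observe(loads)
            # Q-learning update toward reward plus discounted best next value.
            target = r + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
    return q


def greedy_action(q, loads):
    """Pick the cluster with the highest learned value for the current state."""
    state = observe(loads)
    return max(range(NUM_CLUSTERS), key=lambda a: q[state][a])
```

Calling `q = train()` and then `greedy_action(q, [8, 1, 2])` returns the index of the cluster the learned table prefers for that load pattern. Only the reward function encodes the scheduling goal, so a different target (for example, job completion time) would be tried by changing `reward` alone.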