{"title":"HierRL: Hierarchical Reinforcement Learning for Task Scheduling in Distributed Systems","authors":"Yanxia Guan, Yuntao Liu, Yuan Li, Xinhai Xu","doi":"10.1109/IJCNN55064.2022.9892507","DOIUrl":null,"url":null,"abstract":"The distributed system Ray has attracted much attention for many decision-making applications. It provides a flexible and powerful distributed running mechanism for the training of the learning algorithms, which could map the computation tasks to the resources automatically. Task scheduling is a critical component in Ray, adopting a two-layer structure. It uses a simple general scheduling principle, which leaves much space to optimize. In this paper, we will study the two-layer scheduling problem in Ray, setting it as an optimization problem. We firstly present a comprehensive formulation for the problem and point out that it is a NP-hard problem. Then we design a hierarchical reinforcement learning method, named HierRL, which consists of a high-level agent and a low-level agent. Sophisticated state space, action space, and reward function are designed for this method. In the high level, we devise a value-based reinforcement learning method, which allocates a task to an appropriate node of the low level. With tasks allocated from the high level and generated from applications, a low-level reinforcement learning method is constructed to select tasks from the queue to be executed. A hierarchical policy learning method is introduced for the training of the two-layer agents. Finally, we simulate the two-layer scheduling procedure in a public platform, Cloudsim, with tasks from a real Dataset generated by the Alibaba Cluster Trace Program. The results show that the proposed method performs much better than the original scheduling method of Ray.","PeriodicalId":106974,"journal":{"name":"2022 International Joint Conference on Neural Networks (IJCNN)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 International Joint Conference on Neural Networks (IJCNN)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IJCNN55064.2022.9892507","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Cited by: 1
Abstract
The distributed system Ray has attracted much attention for decision-making applications. It provides a flexible and powerful distributed execution mechanism for training learning algorithms, automatically mapping computation tasks to resources. Task scheduling is a critical component of Ray and adopts a two-layer structure; it relies on a simple, general scheduling principle, which leaves considerable room for optimization. In this paper, we study the two-layer scheduling problem in Ray and cast it as an optimization problem. We first present a comprehensive formulation of the problem and show that it is NP-hard. We then design a hierarchical reinforcement learning method, named HierRL, which consists of a high-level agent and a low-level agent, with carefully designed state spaces, action spaces, and reward functions. At the high level, we devise a value-based reinforcement learning method that allocates each task to an appropriate node at the low level. With tasks allocated from the high level and generated by applications, a low-level reinforcement learning method selects tasks from the queue for execution. A hierarchical policy learning method is introduced to train the two-layer agents. Finally, we simulate the two-layer scheduling procedure on a public platform, CloudSim, with tasks from a real dataset generated by the Alibaba Cluster Trace Program. The results show that the proposed method performs much better than Ray's original scheduling method.
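As a rough illustration of the two-layer structure the abstract describes, the sketch below pairs a high-level agent that assigns incoming tasks to nodes with per-node low-level agents that pick the next queued task. It is a minimal Python sketch using tabular epsilon-greedy Q-learning; the state features, reward handling, and all class and function names are illustrative assumptions, not the paper's actual design.

```python
# Hypothetical two-level scheduler sketch in the spirit of HierRL.
# State encodings, hyperparameters, and names are assumptions for illustration.
import random
from collections import defaultdict

class QAgent:
    """Tabular epsilon-greedy Q-learning agent."""
    def __init__(self, n_actions, alpha=0.1, gamma=0.9, eps=0.1):
        self.q = defaultdict(lambda: [0.0] * n_actions)
        self.n_actions, self.alpha, self.gamma, self.eps = n_actions, alpha, gamma, eps

    def act(self, state):
        if random.random() < self.eps:
            return random.randrange(self.n_actions)
        values = self.q[state]
        return values.index(max(values))

    def update(self, state, action, reward, next_state):
        # Standard Q-learning temporal-difference update.
        td_target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (td_target - self.q[state][action])

N_NODES, QUEUE_SLOTS = 4, 8
high_agent = QAgent(n_actions=N_NODES)                     # allocates tasks to nodes
low_agents = [QAgent(n_actions=QUEUE_SLOTS) for _ in range(N_NODES)]  # pick from queues

def node_state(queues):
    # Coarse global state: discretized queue length per node (an assumption).
    return tuple(min(len(q), QUEUE_SLOTS) for q in queues)

def schedule_task(task, queues):
    # High level: choose which node's queue receives the task.
    s = node_state(queues)
    node = high_agent.act(s)
    queues[node].append(task)
    return s, node

def execute_one(node, queues):
    # Low level: choose which queued task on this node runs next.
    q = queues[node]
    if not q:
        return None
    s = (node, min(len(q), QUEUE_SLOTS))
    slot = low_agents[node].act(s) % len(q)
    return q.pop(slot)
```

During training, a reward such as the negative waiting or completion time of a task could be fed to each agent's `update` call; the reward functions actually used in HierRL are more sophisticated, as the abstract notes.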