{"title":"HierRL: Hierarchical Reinforcement Learning for Task Scheduling in Distributed Systems","authors":"Yanxia Guan, Yuntao Liu, Yuan Li, Xinhai Xu","doi":"10.1109/IJCNN55064.2022.9892507","DOIUrl":null,"url":null,"abstract":"The distributed system Ray has attracted much attention for many decision-making applications. It provides a flexible and powerful distributed running mechanism for the training of the learning algorithms, which could map the computation tasks to the resources automatically. Task scheduling is a critical component in Ray, adopting a two-layer structure. It uses a simple general scheduling principle, which leaves much space to optimize. In this paper, we will study the two-layer scheduling problem in Ray, setting it as an optimization problem. We firstly present a comprehensive formulation for the problem and point out that it is a NP-hard problem. Then we design a hierarchical reinforcement learning method, named HierRL, which consists of a high-level agent and a low-level agent. Sophisticated state space, action space, and reward function are designed for this method. In the high level, we devise a value-based reinforcement learning method, which allocates a task to an appropriate node of the low level. With tasks allocated from the high level and generated from applications, a low-level reinforcement learning method is constructed to select tasks from the queue to be executed. A hierarchical policy learning method is introduced for the training of the two-layer agents. Finally, we simulate the two-layer scheduling procedure in a public platform, Cloudsim, with tasks from a real Dataset generated by the Alibaba Cluster Trace Program. The results show that the proposed method performs much better than the original scheduling method of Ray.","PeriodicalId":106974,"journal":{"name":"2022 International Joint Conference on Neural Networks (IJCNN)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 International Joint Conference on Neural Networks (IJCNN)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IJCNN55064.2022.9892507","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Cited by: 1
Abstract
The distributed system Ray has attracted much attention for decision-making applications. It provides a flexible and powerful distributed execution mechanism for training learning algorithms, automatically mapping computation tasks to resources. Task scheduling is a critical component of Ray and adopts a two-layer structure; it relies on a simple, general scheduling principle, which leaves considerable room for optimization. In this paper, we study the two-layer scheduling problem in Ray and cast it as an optimization problem. We first present a comprehensive formulation of the problem and show that it is NP-hard. We then design a hierarchical reinforcement learning method, named HierRL, which consists of a high-level agent and a low-level agent, with carefully designed state spaces, action spaces, and reward functions. At the high level, we devise a value-based reinforcement learning method that allocates each task to an appropriate node at the low level. With tasks allocated from the high level and generated by applications, a low-level reinforcement learning method selects tasks from the queue for execution. A hierarchical policy learning method is introduced to train the two-layer agents. Finally, we simulate the two-layer scheduling procedure on a public platform, CloudSim, with tasks from a real dataset generated by the Alibaba Cluster Trace Program. The results show that the proposed method performs much better than Ray's original scheduling method.
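As a rough illustration of the two-layer structure the abstract describes, the sketch below pairs a high-level agent that assigns incoming tasks to nodes with per-node low-level agents that pick the next queued task. It is a minimal Python sketch using tabular epsilon-greedy Q-learning; the state features, reward handling, and all class and function names are illustrative assumptions, not the paper's actual design.

```python
# Hypothetical two-level scheduler sketch in the spirit of HierRL.
# State encodings, hyperparameters, and names are assumptions for illustration.
import random
from collections import defaultdict

class QAgent:
    """Tabular epsilon-greedy Q-learning agent."""
    def __init__(self, n_actions, alpha=0.1, gamma=0.9, eps=0.1):
        self.q = defaultdict(lambda: [0.0] * n_actions)
        self.n_actions, self.alpha, self.gamma, self.eps = n_actions, alpha, gamma, eps

    def act(self, state):
        if random.random() < self.eps:
            return random.randrange(self.n_actions)
        values = self.q[state]
        return values.index(max(values))

    def update(self, state, action, reward, next_state):
        # Standard Q-learning temporal-difference update.
        td_target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (td_target - self.q[state][action])

N_NODES, QUEUE_SLOTS = 4, 8
high_agent = QAgent(n_actions=N_NODES)                     # allocates tasks to nodes
low_agents = [QAgent(n_actions=QUEUE_SLOTS) for _ in range(N_NODES)]  # pick from queues

def node_state(queues):
    # Coarse global state: discretized queue length per node (an assumption).
    return tuple(min(len(q), QUEUE_SLOTS) for q in queues)

def schedule_task(task, queues):
    # High level: choose which node's queue receives the task.
    s = node_state(queues)
    node = high_agent.act(s)
    queues[node].append(task)
    return s, node

def execute_one(node, queues):
    # Low level: choose which queued task on this node runs next.
    q = queues[node]
    if not q:
        return None
    s = (node, min(len(q), QUEUE_SLOTS))
    slot = low_agents[node].act(s) % len(q)
    return q.pop(slot)
```

During training, a reward such as the negative waiting or completion time of a task could be fed to each agent's `update` call; the reward functions actually used in HierRL are more sophisticated, as the abstract notes.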