Smart-mDAG: An Intelligent Scheduling Method for Multi-DAG Jobs
Authors: Yifan Zhu, Bo Hu
Venue: 2021 International Conference on Information and Communication Technology Convergence (ICTC)
Publication date: 2021-10-20
DOI: https://doi.org/10.1109/ICTC52510.2021.9621176
Citations: 1
Abstract
Job scheduling is a fundamental problem in cloud data centers and has received widespread attention because it plays an essential role in makespan, resource utilization, and the maintenance of scheduling security. As the number of jobs grows rapidly, the demands on scheduling efficiency and makespan become more stringent. Meanwhile, dependencies between tasks are closely related to makespan and throughput, and these dependent tasks form multiple directed acyclic graph (DAG) structures. Heuristic algorithms are limited in their ability to adjust the scheduling policy to diverse dependencies, which lengthens the makespan. In this paper, we propose Smart-mDAG, an intelligent scheduling method for multi-DAG jobs based on deep reinforcement learning. It is a job-specific scheduling method that adjusts the scheduling policy according to diverse dependencies to minimize the makespan. First, a feature extraction module converts the dependencies into numeric form to capture the dependency information in the DAG. Second, cascaded neural networks fuse the scheduling information to estimate the fitness between machines and tasks. We evaluate Smart-mDAG on a small five-machine cluster using Alibaba Cluster Data V2018. The results show that, compared with the control algorithms, Smart-mDAG shortens the makespan for 70% of jobs, and in the best case reduces a single job's makespan to 65% of its previous value.
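
To make the terms in the abstract concrete, the following is a minimal sketch of how task dependencies in a DAG might be turned into numeric features and how a makespan can be computed for a heuristic baseline. It is not the Smart-mDAG feature extraction module or its learned policy, neither of which the abstract specifies. The chosen features (in-degree, out-degree, remaining critical-path length), the function names, and the toy job are all assumptions made for illustration, and task durations are assumed to be strictly positive.

```python
from collections import deque


def topological_order(deps):
    """Kahn's algorithm. `deps` maps each task to the tasks it depends on."""
    children = {t: [] for t in deps}
    indegree = {t: len(parents) for t, parents in deps.items()}
    for task, parents in deps.items():
        for p in parents:
            children[p].append(task)
    queue = deque(t for t in deps if indegree[t] == 0)
    order = []
    while queue:
        t = queue.popleft()
        order.append(t)
        for c in children[t]:
            indegree[c] -= 1
            if indegree[c] == 0:
                queue.append(c)
    if len(order) != len(deps):
        raise ValueError("dependency graph contains a cycle")
    return order, children


def dag_features(deps, duration):
    """Hypothetical per-task features: (in-degree, out-degree, remaining path)."""
    order, children = topological_order(deps)
    remaining = {}
    # Walk in reverse topological order so children are resolved first.
    for t in reversed(order):
        tail = max((remaining[c] for c in children[t]), default=0)
        remaining[t] = duration[t] + tail
    return {t: (len(deps[t]), len(children[t]), remaining[t]) for t in order}


def list_schedule_makespan(deps, duration, num_machines):
    """Heuristic baseline (not deep RL): longest remaining path goes first.

    With strictly positive durations, a parent always has a longer remaining
    path than its children, so this ordering respects every dependency.
    """
    feats = dag_features(deps, duration)
    order = sorted(deps, key=lambda t: -feats[t][2])
    machine_free = [0] * num_machines
    finish = {}
    for t in order:
        ready = max((finish[p] for p in deps[t]), default=0)
        m = min(range(num_machines), key=lambda i: machine_free[i])
        start = max(ready, machine_free[m])
        finish[t] = start + duration[t]
        machine_free[m] = finish[t]
    return max(finish.values())


# Toy job: B and C depend on A, and D depends on both B and C.
deps = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
duration = {"A": 2, "B": 3, "C": 1, "D": 2}

print(dag_features(deps, duration)["A"])          # (0, 2, 7)
print(list_schedule_makespan(deps, duration, 5))  # 7
print(list_schedule_makespan(deps, duration, 1))  # 8
```

On this toy job the five-machine makespan equals the critical-path length (7), so no scheduler could do better. A learned policy such as the one the abstract describes would only have room to improve on a fixed heuristic when machines are contended and several DAGs compete for them.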