Rui Chen, Weiwei Lin, Huikang Huang, Xiaoying Ye, Zhiping Peng
Future Generation Computer Systems, vol. 167, Article 107760. Published 2025-02-17. DOI: 10.1016/j.future.2025.107760. Journal impact factor: 6.2 (JCR Q1, Computer Science, Theory & Methods). Not open access. https://www.sciencedirect.com/science/article/pii/S0167739X2500055X
GAS-MARL: Green-Aware job Scheduling algorithm for HPC clusters based on Multi-Action Deep Reinforcement Learning
In recent years, the computational power of High-Performance Computing (HPC) clusters has surged. However, amid global calls for energy conservation and emission reduction, their rapidly growing power consumption has become a bottleneck to further development. Powering HPC clusters with renewable energy is a crucial measure for reducing their carbon emissions. However, because renewable energy is variable and intermittent, effective job scheduling is urgently needed to make full use of it. To this end, we propose a Green-Aware job Scheduling algorithm for HPC clusters based on Multi-Action Deep Reinforcement Learning (GAS-MARL), which optimizes both renewable energy utilization and average bounded slowdown. In this algorithm, the agent outputs two actions in each decision-making period: a job selection action and a delay decision action. The delay decision action makes the scheduler more flexible, allowing each job to run in an appropriate time slot. Furthermore, we design a new backfilling policy, Green-Backfilling, to work together with GAS-MARL. Experimental evaluations show that, compared with other algorithms, the combination of GAS-MARL and Green-Backfilling significantly improves renewable energy utilization and reduces average bounded slowdown.
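The abstract names two objectives and a two-part action but gives no formulas. As a rough illustration only, the Python sketch below computes average bounded slowdown using the definition common in the job scheduling literature, max(1, (wait + run) / max(run, tau)), and shows one possible shape for a two-action decision (which job to pick, how long to delay it). The threshold `tau=10.0`, the `Job` fields, the `Decision` type, the hourly slot length, and the greedy rule in `toy_two_action_policy` are all assumptions made for this example. They are not the paper's method: GAS-MARL learns both actions with a deep reinforcement learning policy.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Job:
    submit: float   # submission time in seconds
    runtime: float  # execution time in seconds
    power: float    # power draw while running, in watts (hypothetical field)


def bounded_slowdown(wait: float, runtime: float, tau: float = 10.0) -> float:
    """Bounded slowdown of one job; tau keeps very short jobs from dominating."""
    return max(1.0, (wait + runtime) / max(runtime, tau))


def average_bounded_slowdown(
    waits: Sequence[float], runtimes: Sequence[float], tau: float = 10.0
) -> float:
    """Mean bounded slowdown over a batch of completed jobs."""
    if not waits or len(waits) != len(runtimes):
        raise ValueError("waits and runtimes must be non-empty and equal length")
    total = sum(bounded_slowdown(w, r, tau) for w, r in zip(waits, runtimes))
    return total / len(waits)


@dataclass(frozen=True)
class Decision:
    job_index: int  # action 1: which queued job to schedule
    delay: float    # action 2: how long to postpone its start, in seconds


def toy_two_action_policy(
    queue: Sequence[Job],
    now: float,
    green_forecast: Sequence[float],
    slot: float = 3600.0,
    max_slots: int = 3,
) -> Decision:
    """Hand-written stand-in for a learned policy (illustration only).

    green_forecast[k] is the forecast renewable power (watts) k slots from now.
    """
    if not queue or not green_forecast:
        raise ValueError("queue and green_forecast must be non-empty")
    # Action 1: pick the job that has waited longest.
    job_index = max(range(len(queue)), key=lambda i: now - queue[i].submit)
    job = queue[job_index]
    # Action 2: pick the delay whose forecast supply covers the most of the
    # job's power draw; ties resolve to the shortest delay.
    horizon = min(max_slots, len(green_forecast) - 1)
    best_slot = max(
        range(horizon + 1), key=lambda k: min(green_forecast[k], job.power)
    )
    return Decision(job_index=job_index, delay=best_slot * slot)
```

The toy rule makes the trade-off visible: delaying a job toward a slot with more renewable supply raises green energy use, but the added wait feeds directly into the bounded slowdown term, which is why the two objectives have to be optimized together.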
About the journal:
Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications.
Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration.
Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.