A Coalitional Game-Based Adaptive Scheduler Leveraging Task Heterogeneity for Greener Data Centers
Saeed Akbar; Ubaid Ul Akbar; Rahmat Ullah; Zhonglong Zheng
IEEE Transactions on Green Communications and Networking, vol. 9, no. 1, pp. 55-69 (published 2024-06-14)
DOI: 10.1109/TGCN.2024.3414671
https://ieeexplore.ieee.org/document/10557634/
Impact factor: 5.3; JCR Q1 (Telecommunications)
Citations: 0
Abstract
Managing power and its thermal consequences is a paramount concern in modern data center (DC) management. Failure to address escalating energy use can result in excessive heat dissipation, leading to thermal imbalances and hotspots. In addition, prolonged execution of CPU-intensive user jobs on servers operating at higher temperatures can significantly increase the cooling effort a DC requires. Researchers advocate thermal-aware (TA) scheduling as a promising way to counter this issue. However, the existing state of the art overlooks the runtime heterogeneity of user jobs, potentially aggravating heat dissipation when CPU-intensive tasks run for longer durations on servers at elevated temperatures. Moreover, existing works provide no mechanism to detect overloaded computing nodes at runtime in a TA context. Finally, existing strategies do not adapt to a DC's dynamic thermal conditions. This paper proposes Coalitional Game-based Thermal-aware Adaptive Scheduling (CGTAS), tailored for heterogeneous DCs, to minimize the cooling cost stemming from excessive heat generated during compute-intensive job execution. CGTAS differentiates incoming jobs based on their thermal profiles and CPU time for optimal thermal outcomes. In addition, it dynamically allocates user jobs to computing nodes based on their real-time marginal thermal performance, using the Core solution concept from game theory. Finally, unlike existing TA strategies, the proposed design uses the Core to identify thermally overloaded computing elements and performs task migrations to optimize thermal efficiency. Extensive simulations show energy savings of up to 26.08% compared with alternative TA schedulers, promoting sustainable, high-performance computing infrastructure in large-scale cloud DCs.
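
For readers unfamiliar with the Core, the following is a minimal sketch of the textbook definition only, not the CGTAS algorithm: the abstract does not specify the paper's characteristic function or how payoffs map to thermal performance. The node names, the coalition values (read here as a hypothetical "cooling cost saved when these nodes cooperate"), and the helper names `coalitions`, `blocking_coalitions`, and `in_core` are all assumptions made for illustration. A payoff vector is in the Core when it distributes exactly the grand coalition's value and no sub-coalition could obtain more on its own.

```python
from itertools import combinations

# Hypothetical toy data: three computing nodes and made-up coalition values.
# These numbers are illustrative assumptions, not results from the paper.
PLAYERS = ("A", "B", "C")
VALUE = {
    frozenset("A"): 0, frozenset("B"): 0, frozenset("C"): 0,
    frozenset("AB"): 4, frozenset("AC"): 3, frozenset("BC"): 2,
    frozenset("ABC"): 6,
}


def coalitions(players):
    """Yield every non-empty coalition of players as a frozenset."""
    for size in range(1, len(players) + 1):
        for members in combinations(players, size):
            yield frozenset(members)


def blocking_coalitions(players, value, payoff, tol=1e-9):
    """Return coalitions whose members jointly receive less than value[S]."""
    return [
        s for s in coalitions(players)
        if sum(payoff[p] for p in s) < value[s] - tol
    ]


def in_core(players, value, payoff, tol=1e-9):
    """Check efficiency (payoffs sum to v(N)) and absence of blocking coalitions."""
    total = sum(payoff[p] for p in players)
    efficient = abs(total - value[frozenset(players)]) <= tol
    return efficient and not blocking_coalitions(players, value, payoff, tol)


# Two hypothetical payoff vectors that both distribute v(N) = 6.
STABLE = {"A": 3, "B": 2, "C": 1}
UNSTABLE = {"A": 5, "B": 1, "C": 0}

if __name__ == "__main__":
    print(in_core(PLAYERS, VALUE, STABLE))
    print(in_core(PLAYERS, VALUE, UNSTABLE))
    print(blocking_coalitions(PLAYERS, VALUE, UNSTABLE))
```

In this toy game the second payoff vector gives nodes B and C less in total than they could secure as a pair, so that pair blocks the allocation. The check enumerates every coalition, which grows exponentially with the number of players, so it is suitable only for small illustrative cases.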