Equalizer: Energy-efficient machine learning-based heterogeneous cluster load balancer
Taha Abdelazziz Rahmani, Ghalem Belalem, Sidi Ahmed Mahmoudi, Omar Rafik Merad-Boudia
Concurrency and Computation: Practice and Experience, 36(23), published 2024-07-25
DOI: 10.1002/cpe.8230
https://onlinelibrary.wiley.com/doi/10.1002/cpe.8230
Citations: 0
Abstract
Heterogeneous systems deliver high computing performance when used effectively. Each application should run on the most suitable device while the system as a whole stays balanced. However, distributing the computing load evenly is difficult because the devices in the system differ in computing power and architecture. Scheduling applications in real time complicates the task further, because no prior information about the submitted applications is available. In this context, we introduce “Equalizer,” a real-time load balancer for heterogeneous systems. “Equalizer” uses machine learning to continuously monitor the system's state and to predict, at runtime, the optimal device for each application. It assigns applications to the devices that minimize system imbalance. To quantify system imbalance, we propose a novel metric that reflects the disparity in computing loads across the system's devices. This metric is calculated from the predicted execution times of applications. To validate the performance of “Equalizer,” we conducted a comparative study against two widely adopted approaches, Round Robin and Device Suitability. The experiments were performed on a heterogeneous cluster comprising a master host and three slave servers, equipped with a total of 4 central processing units (CPUs) and 4 graphics processing units (GPUs). All approaches were deployed on the cluster and evaluated using three distinct workloads categorized by computing intensity: medium intensity, heavy intensity, and a combination of heavy and medium intensity that simulates real-world scenarios. Each workload consisted of 80 OpenCL applications with varying input data sizes. The experimental results demonstrate that “Equalizer” effectively minimized the system's imbalance, reduced the idle time of devices, and eliminated overloads. Moreover, “Equalizer” exhibited significant improvements in workload execution time, resource utilization, throughput, and energy consumption. Across all tested scenarios, “Equalizer” consistently outperformed the alternative approaches, showing its robustness, its adaptability to dynamic environments, and its applicability in real-world practice.
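The assignment rule described in the abstract can be illustrated with a small sketch. The abstract gives neither the formula of the imbalance metric nor the prediction model, so everything below is an assumption made for illustration: imbalance is taken to be the population standard deviation of the per-device queued execution time, predicted execution times are supplied directly as numbers rather than produced by a learned model, and the names `imbalance`, `schedule_min_imbalance`, and `schedule_round_robin` are hypothetical. It is not the authors' implementation.

```python
from statistics import pstdev


def imbalance(loads):
    """Assumed metric: population std. dev. of per-device queued time (seconds)."""
    return pstdev(loads)


def schedule_min_imbalance(apps, n_devices):
    """Greedy sketch: place each arriving app on the device that yields the lowest imbalance.

    apps: list of per-app lists; apps[i][d] is the predicted execution
          time of app i on device d (assumed to come from some predictor).
    Returns (assignments, final per-device loads).
    """
    loads = [0.0] * n_devices
    assignments = []
    for predicted in apps:
        best_device, best_score = 0, float("inf")
        for device in range(n_devices):
            trial = list(loads)
            trial[device] += predicted[device]  # load if the app ran here
            score = imbalance(trial)
            if score < best_score:
                best_device, best_score = device, score
        loads[best_device] += predicted[best_device]
        assignments.append(best_device)
    return assignments, loads


def schedule_round_robin(apps, n_devices):
    """Baseline: cycle through devices, ignoring predicted times."""
    loads = [0.0] * n_devices
    assignments = []
    for i, predicted in enumerate(apps):
        device = i % n_devices
        loads[device] += predicted[device]
        assignments.append(device)
    return assignments, loads


if __name__ == "__main__":
    # Four identical apps; device 0 is a slow CPU (4 s), device 1 a fast GPU (1 s).
    apps = [[4.0, 1.0]] * 4
    print(schedule_min_imbalance(apps, 2))
    print(schedule_round_robin(apps, 2))
```

In this toy case the greedy rule sends three of the four applications to the faster device and leaves the busiest device with 4 s of queued work, whereas Round Robin alternates blindly and leaves 8 s on the slower device. The numbers are made up and say nothing about the paper's measured results.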
About the journal:
Concurrency and Computation: Practice and Experience (CCPE) publishes high-quality, original research papers, and authoritative research review papers, in the overlapping fields of:
Parallel and distributed computing;
High-performance computing;
Computational and data science;
Artificial intelligence and machine learning;
Big data applications, algorithms, and systems;
Network science;
Ontologies and semantics;
Security and privacy;
Cloud/edge/fog computing;
Green computing; and
Quantum computing.