Energy-deadline optimization with minimal task failure aware task partitioning model in heterogeneous cloud computing framework

IF 4.0 | Zone 3 (Computer Science) | JCR Q1: COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Computers & Electrical Engineering Pub Date : 2025-05-17 DOI:10.1016/j.compeleceng.2025.110438

KN Divyaprabha, TSB Sudarshan

Computers & Electrical Engineering, volume 125, Article 110438. Full text: https://www.sciencedirect.com/science/article/pii/S0045790625003817

Citations: 0

Abstract

Energy-deadline optimization with minimal task failure aware task partitioning model in heterogeneous cloud computing framework

Central processing units (CPUs) and graphics processing units (GPUs) are used in high-performance computing (HPC) to provide scalable and effective computing paradigms for data-intensive scientific workloads. Energy use must nonetheless be considered because of rising operational costs and green computing standards. Scheduling scientific workloads is challenging because heterogeneous cloud computing (HCC) infrastructures consume more energy, which raises carbon emissions and lowers infrastructure reliability. Dynamic voltage-frequency scaling (DVFS) can improve the energy management of cloud infrastructure, but it also decreases dependability and increases the error rate of workload scheduling on a CPU-GPU HCC architecture; reducing task failure and minimizing energy are therefore the core issues this work addresses. The work first introduces the energy-deadline-aware task scheduling optimization (EDATSO) technique and then the task-failure minimization-aware optimal scheduling (TFMOS) technique for executing scientific workflows. In simulations on realistic scientific workloads, EDATSO reduces energy usage by 40.3% and 33.12%, makespan by 90.35% and 53.56%, and the additional energy overhead caused by task failures by 95.56% and 87.59%, compared with energy minimized scheduling (EMS) and multi-objective prioritized workflow scheduling through deep reinforcement learning (MOPWSDRL), respectively. TFMOS reduces energy usage by 40.33% and 46.4%, makespan by 90.4% and 53.95%, and the additional energy overhead caused by task failures by 95.58% and 87.61%, compared with EMS and MOPWSDRL, respectively.

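Source journal

Computers & Electrical Engineering (Engineering & Technology; Engineering: Electrical & Electronic)

CiteScore: 9.20

Self-citation rate: 7.00%

Articles published: 661

Review time: 47 days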

About the journal: The impact of computers has nowhere been more revolutionary than in electrical engineering. The design, analysis, and operation of electrical and electronic systems are now dominated by computers, a transformation that has been motivated by the natural ease of interface between computers and electrical systems, and the promise of spectacular improvements in speed and efficiency. Published since 1973, Computers & Electrical Engineering provides rapid publication of topical research into the integration of computer technology and computational techniques with electrical and electronic systems. The journal publishes papers featuring novel implementations of computers and computational techniques in areas like signal and image processing, high-performance computing, parallel processing, and communications. Special attention will be paid to papers describing innovative architectures, algorithms, and software tools.