A Novel Scheduling Approach for Spark Workflow Tasks With Deadline and Uncertain Performance in Multi-Cloud Networks

IF 5.3; Zone 2 (Computer Science); JCR Q1 (Computer Science, Information Systems)
Kamran Yaseen Rajput;Xiaoping Li;Jinquan Zhang;Abdullah Lakhan
DOI: 10.1109/TCC.2024.3449771
Journal: IEEE Transactions on Cloud Computing, vol. 12, no. 4, pp. 1145-1157
Published: 2024-08-26
URL: https://ieeexplore.ieee.org/document/10646491/
Citation count: 0

Abstract

The use of cloud computing services across applications has grown steadily. Applications in business, commerce, healthcare, and other domains require additional computing capacity to run. To meet these expanding demands, cloud computing offers a pay-as-you-go billing model that lets such applications run cost-effectively. However, their complex requirements often call for more than one cloud, because single-cloud solutions are limited by resource constraints, such as inadequate storage and computing power, and by single points of failure that can compromise the integrity of the entire application. Consequently, multi-cloud strategies, which provide more scalable storage and computing resources, are becoming increasingly popular. The multi-cloud landscape, however, comprises many cloud providers, and managing workflow scheduling effectively in this dynamic environment is a significant challenge. This paper focuses on scheduling Spark workflow tasks in multi-cloud networks. It addresses the challenges posed by different pricing models, dynamic resource provisioning, inter- and intra-transmission times, and unstable resource performance. To address them, we propose a novel heuristic-based approach that accounts for constraints such as virtual machine (VM) instance heterogeneity, priority constraints, transmission times, and the impact of performance uncertainty. The goal is to schedule all tasks on VMs at the lowest possible rental cost while meeting workflow deadlines. Simulation results show that the proposed method effectively schedules Spark workflow tasks in multi-cloud networks, improving scheduling performance by 50% compared to existing approaches.
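The abstract does not describe the heuristic itself, so the following is only a minimal sketch of the problem setting, not the authors' algorithm. It assumes a workflow given as a task graph, a small catalogue of VM types that differ in speed and price, a uniform transfer delay on every dependency edge, pro rata billing, and performance uncertainty modelled as a fixed `slack` factor that inflates every runtime estimate. All names (`VMType`, `greedy_schedule`, `slack`, `transfer`) are hypothetical. The sketch splits the workflow deadline into per-task sub-deadlines in proportion to the earliest finish times on the fastest VM type, then gives each task the cheapest VM type that still meets its sub-deadline.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter


@dataclass(frozen=True)
class VMType:
    name: str
    speed: float  # work units per hour (hypothetical unit)
    price: float  # rental price per hour, billed pro rata (assumption)


def greedy_schedule(workload, preds, vm_types, deadline, slack=0.2, transfer=0.0):
    """Illustrative greedy heuristic; assumes positive workloads.

    workload: task -> amount of work; preds: task -> list of predecessor tasks.
    Each task runs on its own VM instance (no instance reuse, for simplicity).
    Returns (task -> VM type name, makespan in hours, total rental cost).
    """
    graph = {t: preds.get(t, ()) for t in workload}
    order = list(TopologicalSorter(graph).static_order())
    fastest = max(vm_types, key=lambda vm: vm.speed)

    def runtime(task, vm):
        # Inflate the estimate by `slack` to hedge against unstable performance.
        return workload[task] * (1.0 + slack) / vm.speed

    def ready(task, finish):
        # A task may start once every predecessor has finished and sent its data.
        return max((finish[p] + transfer for p in graph[task]), default=0.0)

    # Pass 1: earliest finish times if every task used the fastest VM type.
    eft = {}
    for t in order:
        eft[t] = ready(t, eft) + runtime(t, fastest)
    critical_path = max(eft.values())
    if critical_path > deadline:
        raise ValueError("deadline is infeasible even on the fastest VM type")
    scale = deadline / critical_path  # stretch factor for sub-deadlines

    # Pass 2: cheapest VM type per task that still meets the task's sub-deadline.
    plan, finish, cost = {}, {}, 0.0
    for t in order:
        start = ready(t, finish)
        feasible = [vm for vm in vm_types
                    if start + runtime(t, vm) <= eft[t] * scale + 1e-9]
        vm = min(feasible, key=lambda v: runtime(t, v) * v.price)
        plan[t] = vm.name
        finish[t] = start + runtime(t, vm)
        cost += runtime(t, vm) * vm.price
    return plan, max(finish.values()), cost
```

Because the sub-deadlines are the fastest-type finish times stretched by a factor of at least one, the fastest VM type always remains a feasible choice, so the sketch never runs out of options once the initial feasibility check passes. A realistic scheduler for this setting would also need per-provider pricing models, instance reuse, and distinct inter- and intra-cloud transfer times, none of which are modelled here.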
Source journal: IEEE Transactions on Cloud Computing (Computer Science, Software)
CiteScore: 9.40
Self-citation rate: 6.20%
Articles published: 167
Journal description: The IEEE Transactions on Cloud Computing (TCC) is dedicated to the multidisciplinary field of cloud computing. It is committed to the publication of articles that present innovative research ideas, application results, and case studies in cloud computing, focusing on key technical issues related to theory, algorithms, systems, applications, and performance.