A Novel Scheduling Approach for Spark Workflow Tasks With Deadline and Uncertain Performance in Multi-Cloud Networks

Impact factor 5.3 · Zone 2, Computer Science · JCR Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS
Kamran Yaseen Rajput;Xiaoping Li;Jinquan Zhang;Abdullah Lakhan
DOI: 10.1109/TCC.2024.3449771
Journal: IEEE Transactions on Cloud Computing, vol. 12, no. 4, pp. 1145-1157
Published: 2024-08-26
Publication type: Journal Article
Source: https://ieeexplore.ieee.org/document/10646491/
Citations: 0

Abstract

The use of cloud computing services across applications continues to grow. Applications in business, commerce, healthcare, and other domains require additional computing capacity to run. To meet these growing demands, cloud computing offers a pay-as-you-go billing model that lets such applications run cost-effectively. However, given their complex requirements, a single cloud is often insufficient: single-cloud solutions are limited by resource constraints, such as inadequate storage and computing power, and by single points of failure that can compromise the integrity of the entire application. Consequently, multi-cloud strategies, which provide more scalable storage and computing resources, are becoming increasingly popular. However, the multi-cloud landscape comprises many cloud providers, and managing workflow scheduling effectively in this dynamic environment is a significant challenge. This paper focuses on scheduling Spark workflow tasks in multi-cloud networks. It addresses the challenges posed by different pricing models, dynamic resource provisioning, inter- and intra-transmission times, and unstable resource performance. To address these challenges, we propose a novel heuristic-based approach that accounts for constraints such as VM instance heterogeneity, priority constraints, transmission times, and the impact of performance uncertainty. The goal is to schedule all tasks on virtual machines (VMs) at the lowest possible rental cost while meeting workflow deadlines. Simulation results show that the proposed method effectively schedules Spark workflow tasks in multi-cloud networks, improving scheduling performance by 50% compared to existing approaches.
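
To make the stated objective concrete (lowest rental cost subject to a deadline under uncertain performance), here is a minimal Python sketch. It is not the paper's heuristic, whose details the abstract does not give. The names `VMType`, `rental_cost`, and `cheapest_feasible_vm`, the billing rounded up to whole hours, and the `slowdown` factor used to model performance uncertainty are all illustrative assumptions.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class VMType:
    """Hypothetical VM type; fields are assumptions for illustration."""
    name: str
    speed: float           # nominal work units processed per hour
    price_per_hour: float  # pay-as-you-go rental price


def rental_cost(vm: VMType, hours: float) -> float:
    """Pay-as-you-go cost, assuming billing is rounded up to whole hours."""
    return math.ceil(hours) * vm.price_per_hour


def cheapest_feasible_vm(
    workload: float,
    deadline_hours: float,
    vm_types: Sequence[VMType],
    slowdown: float = 1.2,
) -> Optional[Tuple[VMType, float]]:
    """Pick the cheapest VM type that meets the deadline even if the VM
    runs `slowdown` times slower than nominal. Returns None if no VM
    type is feasible."""
    best: Optional[Tuple[VMType, float]] = None
    for vm in vm_types:
        # Worst-case runtime under the assumed performance degradation.
        worst_case_hours = workload / vm.speed * slowdown
        if worst_case_hours > deadline_hours:
            continue  # would miss the deadline in the worst case
        cost = rental_cost(vm, worst_case_hours)
        if best is None or cost < best[1]:
            best = (vm, cost)
    return best


if __name__ == "__main__":
    vms = [VMType("small", 10.0, 1.0), VMType("large", 40.0, 5.0)]
    for deadline in (20.0, 5.0, 3.0):
        print(deadline, cheapest_feasible_vm(100.0, deadline, vms, slowdown=1.5))
```

With a loose deadline the slower, cheaper VM type is selected. As the deadline tightens, only the faster and more expensive type remains feasible, and eventually none does. A full workflow scheduler would also have to handle task precedence and data transmission times, which this single-task sketch ignores.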
Source journal: IEEE Transactions on Cloud Computing (Computer Science-Software)
CiteScore: 9.40
Self-citation rate: 6.20%
Articles published: 167
About the journal: The IEEE Transactions on Cloud Computing (TCC) is dedicated to the multidisciplinary field of cloud computing. It is committed to the publication of articles that present innovative research ideas, application results, and case studies in cloud computing, focusing on key technical issues related to theory, algorithms, systems, applications, and performance.