A Deep Reinforcement Learning Approach for Cost Optimized Workflow Scheduling in Cloud Computing Environments

arXiv - CS - Distributed, Parallel, and Cluster Computing | Pub Date: 2024-08-06 | DOI: arxiv-2408.02926

Amanda Jayanetti, Saman Halgamuge, Rajkumar Buyya



Abstract

Cost optimization is a common goal of workflow schedulers operating in cloud computing environments. Spot instances are one potential means of achieving this goal: cloud providers offer them at discounted prices compared to their on-demand counterparts in exchange for reduced reliability. The reduced reliability arises because spot instances are subject to interruption when the spare computing capacity used to provision them is needed back owing to demand variations. In addition, spot prices are not fixed, since pricing depends on long-term supply and demand. The possibility of interruptions and the price variations associated with spot instances add a layer of uncertainty to the general problem of workflow scheduling across cloud computing environments. These challenges must be addressed efficiently in order to realize the cost savings achievable with spot instances without compromising the underlying business requirements. To this end, in this paper we use Deep Reinforcement Learning to develop an autonomous agent capable of scheduling workflows in a cost-efficient manner by using an intelligent mix of spot and on-demand instances. The proposed solution is implemented in the open-source, container-native Argo workflow engine, which is widely used for executing industrial workflows. The experimental results demonstrate that the proposed scheduling method is capable of outperforming the current benchmarks.
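The trade-off the abstract describes, a lower spot price weighed against the risk of interruption, can be illustrated with a small expected-cost calculation. The sketch below is not the paper's method (which trains a Deep Reinforcement Learning agent); it is a static rule under stated assumptions: an interruption occurs with a fixed probability, on average halfway through the task, after which the task is rerun from scratch on an on-demand instance. The function names, prices, and probabilities are all hypothetical.

```python
def expected_spot_cost(runtime_h, spot_price, on_demand_price, p_interrupt):
    """Expected cost of trying a task on spot first, with on-demand fallback.

    Assumptions (illustrative only, not taken from the paper):
    - the task is interrupted with probability p_interrupt;
    - an interruption wastes, on average, half the runtime at the spot price;
    - after an interruption the whole task reruns on an on-demand instance.
    """
    # Task completes on the spot instance without interruption.
    success = (1.0 - p_interrupt) * runtime_h * spot_price
    # Task is interrupted: pay for the wasted spot time plus a full rerun.
    failure = p_interrupt * (
        0.5 * runtime_h * spot_price + runtime_h * on_demand_price
    )
    return success + failure


def choose_instance(runtime_h, spot_price, on_demand_price, p_interrupt):
    """Return (instance_type, expected_cost) for the cheaper option."""
    on_demand_cost = runtime_h * on_demand_price
    spot_cost = expected_spot_cost(
        runtime_h, spot_price, on_demand_price, p_interrupt
    )
    if spot_cost < on_demand_cost:
        return ("spot", spot_cost)
    return ("on-demand", on_demand_cost)


if __name__ == "__main__":
    # Hypothetical hourly prices: 0.03 for spot, 0.10 for on-demand.
    for p in (0.1, 0.9):
        kind, cost = choose_instance(2.0, 0.03, 0.10, p)
        print(f"p_interrupt={p}: choose {kind}, expected cost {cost:.3f}")
```

A fixed rule like this ignores the time-varying prices and interruption rates that the abstract identifies as the source of uncertainty, which is the motivation for learning a scheduling policy instead.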
