Reliable workflow execution in distributed systems for cost efficiency

2010 11th IEEE/ACM International Conference on Grid Computing Pub Date : 2010-10-01 DOI:10.1109/GRID.2010.5697959

Young Choon Lee, Albert Y. Zomaya, Mazin S. Yousif

Pages: 89-96

Citations: 10

Abstract

Reliability is of great practical importance in distributed computing systems (DCSs) because it directly affects system performance, i.e., quality of service. The issue is especially crucial for ‘cost-conscious’ DCSs such as grids and clouds, where unreliability brings about additional, often excessive, capital and operating costs. Resource failures are considered the main source of unreliability in this study. We investigate the reliability of workflow execution in the context of scheduling and its effect on operating costs in DCSs, and present the reliability for profit assurance (RPA) algorithm as a novel workflow scheduling heuristic. The proposed RPA algorithm incorporates an (operating) cost-aware replication scheme to increase reliability. This cost awareness contributes greatly to replication decisions that are efficient in terms of profitability. To the best of our knowledge, the work in this paper is the first attempt to explicitly take (monetary) reliability cost into account in workflow scheduling.
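
The abstract does not give the details of the RPA algorithm, so the following is not the authors' method. It is only a minimal sketch of the general idea of a cost-aware replication decision, assuming a toy model in which each replica of a task fails independently with the same probability, every replica incurs a fixed operating cost, and revenue is earned only if at least one replica completes. All function names and parameters are hypothetical and introduced purely for illustration.

```python
def success_probability(failure_prob: float, replicas: int) -> float:
    """Probability that at least one of `replicas` independent copies completes.

    Assumption (not from the paper): replicas fail independently with the
    same probability `failure_prob`.
    """
    return 1.0 - failure_prob ** replicas


def expected_profit(revenue: float, cost_per_replica: float,
                    failure_prob: float, replicas: int) -> float:
    """Expected revenue minus the operating cost of running all replicas."""
    expected_revenue = revenue * success_probability(failure_prob, replicas)
    return expected_revenue - cost_per_replica * replicas


def best_replica_count(revenue: float, cost_per_replica: float,
                       failure_prob: float, max_replicas: int = 5) -> int:
    """Pick the replica count (1..max_replicas) with the highest expected profit.

    Ties are resolved in favour of the smaller replica count.
    """
    return max(
        range(1, max_replicas + 1),
        key=lambda r: expected_profit(revenue, cost_per_replica, failure_prob, r),
    )


if __name__ == "__main__":
    # Cheap replicas: a second copy pays for itself through higher reliability.
    print(best_replica_count(revenue=100.0, cost_per_replica=10.0, failure_prob=0.2))
    # Expensive replicas: extra reliability no longer justifies the extra cost.
    print(best_replica_count(revenue=100.0, cost_per_replica=30.0, failure_prob=0.2))
```

In this toy model, adding replicas always raises reliability, but each one raises operating cost as well, so the most profitable replica count depends on the ratio of replica cost to revenue. That trade-off is the kind of decision the abstract describes as cost-aware replication.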

