类mapreduce系统中单个作业的推测执行

2014 IEEE 7th International Conference on Cloud Computing Pub Date : 2014-06-27 DOI:10.1109/CLOUD.2014.84

Huanle Xu, W. Lau

{"title":"类mapreduce系统中单个作业的推测执行","authors":"Huanle Xu, W. Lau","doi":"10.1109/CLOUD.2014.84","DOIUrl":null,"url":null,"abstract":"Parallel processing plays an important role for large-scale data analytics. It breaks a job into many small tasks which run parallel on multiple machines such as MapReduce framework. One fundamental challenge faced to such parallel processing is the straggling tasks as they can delay the completion of a job seriously. In this paper, we focus on the speculative execution issue which is used to deal with the straggling problem in the literature. We present a theoretical framework for the optimization of a single job which differs a lot from the previous heuristics-based work. More precisely, we propose two schemes when the number of parallel tasks the job consists of is smaller than cluster size. In the first scheme, no monitoring is needed and we can provide the job deadline guarantee with a high probability while achieve the optimal resource consumption level. The second scheme needs to monitor the task progress and makes the optimal number of duplicates when the straggling problem happens. On the other hand, when the number of tasks in a job is larger than the cluster size, we propose an Enhanced Speculative Execution (ESE) algorithm to make the optimal decision whenever a machine is available for a new scheduling. The simulation results show the ESE algorithm can reduce the job flow time by 50% while consume fewer resources comparing to the strategy without backup.","PeriodicalId":288542,"journal":{"name":"2014 IEEE 7th International Conference on Cloud Computing","volume":"412 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"17","resultStr":"{\"title\":\"Speculative Execution for a Single Job in a MapReduce-Like System\",\"authors\":\"Huanle Xu, W. Lau\",\"doi\":\"10.1109/CLOUD.2014.84\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Parallel processing plays an important role for large-scale data analytics. It breaks a job into many small tasks which run parallel on multiple machines such as MapReduce framework. One fundamental challenge faced to such parallel processing is the straggling tasks as they can delay the completion of a job seriously. In this paper, we focus on the speculative execution issue which is used to deal with the straggling problem in the literature. We present a theoretical framework for the optimization of a single job which differs a lot from the previous heuristics-based work. More precisely, we propose two schemes when the number of parallel tasks the job consists of is smaller than cluster size. In the first scheme, no monitoring is needed and we can provide the job deadline guarantee with a high probability while achieve the optimal resource consumption level. The second scheme needs to monitor the task progress and makes the optimal number of duplicates when the straggling problem happens. On the other hand, when the number of tasks in a job is larger than the cluster size, we propose an Enhanced Speculative Execution (ESE) algorithm to make the optimal decision whenever a machine is available for a new scheduling. The simulation results show the ESE algorithm can reduce the job flow time by 50% while consume fewer resources comparing to the strategy without backup.\",\"PeriodicalId\":288542,\"journal\":{\"name\":\"2014 IEEE 7th International Conference on Cloud Computing\",\"volume\":\"412 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-06-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"17\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2014 IEEE 7th International Conference on Cloud Computing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CLOUD.2014.84\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 IEEE 7th International Conference on Cloud Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CLOUD.2014.84","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 17

摘要

并行处理在大规模数据分析中起着重要的作用。它将一个作业分解成许多小任务，这些任务在多台机器上并行运行，比如MapReduce框架。这种并行处理面临的一个基本挑战是分散的任务，因为它们可能严重延迟任务的完成。本文主要研究了文献中用于处理离散问题的推测执行问题。我们提出了一个理论框架的优化一个单一的工作，不同于以往的启发式为基础的工作。更准确地说，我们提出了两种方案，当作业由并行任务组成的数量小于集群大小时。在第一种方案中，我们不需要监控，在达到最优资源消耗水平的同时，我们可以提供高概率的作业期限保证。第二种方案需要监控任务进度，并在出现散列问题时获得最优的副本数量。另一方面，当作业中的任务数量大于集群大小时，我们提出了一种增强推测执行(Enhanced Speculative Execution, ESE)算法，以便在机器可用于新调度时做出最优决策。仿真结果表明，与无备份策略相比，ESE算法可将作业流程时间缩短50%，同时消耗的资源更少。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Speculative Execution for a Single Job in a MapReduce-Like System

Parallel processing plays an important role for large-scale data analytics. It breaks a job into many small tasks which run parallel on multiple machines such as MapReduce framework. One fundamental challenge faced to such parallel processing is the straggling tasks as they can delay the completion of a job seriously. In this paper, we focus on the speculative execution issue which is used to deal with the straggling problem in the literature. We present a theoretical framework for the optimization of a single job which differs a lot from the previous heuristics-based work. More precisely, we propose two schemes when the number of parallel tasks the job consists of is smaller than cluster size. In the first scheme, no monitoring is needed and we can provide the job deadline guarantee with a high probability while achieve the optimal resource consumption level. The second scheme needs to monitor the task progress and makes the optimal number of duplicates when the straggling problem happens. On the other hand, when the number of tasks in a job is larger than the cluster size, we propose an Enhanced Speculative Execution (ESE) algorithm to make the optimal decision whenever a machine is available for a new scheduling. The simulation results show the ESE algorithm can reduce the job flow time by 50% while consume fewer resources comparing to the strategy without backup.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2014 IEEE 7th International Conference on Cloud Computing

自引率

0.00%

发文量