RAFTing MapReduce:在RAFT上快速恢复

2011 IEEE 27th International Conference on Data Engineering Pub Date : 2011-04-11 DOI:10.1109/ICDE.2011.5767877

Jorge-Arnulfo Quiané-Ruiz, C. Pinkel, Jörg Schad, J. Dittrich

{"title":"RAFTing MapReduce:在RAFT上快速恢复","authors":"Jorge-Arnulfo Quiané-Ruiz, C. Pinkel, Jörg Schad, J. Dittrich","doi":"10.1109/ICDE.2011.5767877","DOIUrl":null,"url":null,"abstract":"MapReduce is a computing paradigm that has gained a lot of popularity as it allows non-expert users to easily run complex analytical tasks at very large-scale. At such scale, task and node failures are no longer an exception but rather a characteristic of large-scale systems. This makes fault-tolerance a critical issue for the efficient operation of any application. MapReduce automatically reschedules failed tasks to available nodes, which in turn recompute such tasks from scratch. However, this policy can significantly decrease performance of applications. In this paper, we propose a family of Recovery Algorithms for Fast-Tracking (RAFT) MapReduce. As ease-of-use is a major feature of MapReduce, RAFT focuses on simplicity and also non-intrusiveness, in order to be implementation-independent. To efficiently recover from task failures, RAFT exploits the fact that MapReduce produces and persists intermediate results at several points in time. RAFT piggy-backs checkpoints on the task progress computation. To deal with multiple node failures, we propose query metadata checkpointing. We keep track of the mapping between input key-value pairs and intermediate data for all reduce tasks. Thereby, RAFT does not need to re-execute completed map tasks entirely. Instead RAFT only recomputes intermediate data that were processed for local reduce tasks and hence not shipped to another node for processing. We also introduce a scheduling strategy taking full advantage of these recovery algorithms. We implemented RAFT on top of Hadoop and evaluated it on a 45-node cluster using three common analytical tasks. Overall, our experimental results demonstrate that RAFT outperforms Hadoop runtimes by 23% on average under task and node failures. The results also show that RAFT has negligible runtime overhead.","PeriodicalId":332374,"journal":{"name":"2011 IEEE 27th International Conference on Data Engineering","volume":"32 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"83","resultStr":"{\"title\":\"RAFTing MapReduce: Fast recovery on the RAFT\",\"authors\":\"Jorge-Arnulfo Quiané-Ruiz, C. Pinkel, Jörg Schad, J. Dittrich\",\"doi\":\"10.1109/ICDE.2011.5767877\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"MapReduce is a computing paradigm that has gained a lot of popularity as it allows non-expert users to easily run complex analytical tasks at very large-scale. At such scale, task and node failures are no longer an exception but rather a characteristic of large-scale systems. This makes fault-tolerance a critical issue for the efficient operation of any application. MapReduce automatically reschedules failed tasks to available nodes, which in turn recompute such tasks from scratch. However, this policy can significantly decrease performance of applications. In this paper, we propose a family of Recovery Algorithms for Fast-Tracking (RAFT) MapReduce. As ease-of-use is a major feature of MapReduce, RAFT focuses on simplicity and also non-intrusiveness, in order to be implementation-independent. To efficiently recover from task failures, RAFT exploits the fact that MapReduce produces and persists intermediate results at several points in time. RAFT piggy-backs checkpoints on the task progress computation. To deal with multiple node failures, we propose query metadata checkpointing. We keep track of the mapping between input key-value pairs and intermediate data for all reduce tasks. Thereby, RAFT does not need to re-execute completed map tasks entirely. Instead RAFT only recomputes intermediate data that were processed for local reduce tasks and hence not shipped to another node for processing. We also introduce a scheduling strategy taking full advantage of these recovery algorithms. We implemented RAFT on top of Hadoop and evaluated it on a 45-node cluster using three common analytical tasks. Overall, our experimental results demonstrate that RAFT outperforms Hadoop runtimes by 23% on average under task and node failures. The results also show that RAFT has negligible runtime overhead.\",\"PeriodicalId\":332374,\"journal\":{\"name\":\"2011 IEEE 27th International Conference on Data Engineering\",\"volume\":\"32 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2011-04-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"83\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2011 IEEE 27th International Conference on Data Engineering\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICDE.2011.5767877\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 IEEE 27th International Conference on Data Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDE.2011.5767877","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 83

摘要

MapReduce是一种非常受欢迎的计算范例，因为它允许非专业用户轻松地在非常大规模的情况下运行复杂的分析任务。在这样的规模下，任务和节点故障不再是例外，而是大规模系统的一个特征。这使得容错成为任何应用程序有效操作的关键问题。MapReduce会自动将失败的任务重新调度到可用的节点上，这些节点会重新计算失败的任务。但是，此策略会显著降低应用程序的性能。在本文中，我们提出了一组用于快速跟踪(RAFT) MapReduce的恢复算法。由于易用性是MapReduce的一个主要特性，RAFT专注于简单性和非侵入性，以便独立于实现。为了有效地从任务失败中恢复，RAFT利用了MapReduce在几个时间点产生并持久保存中间结果的事实。RAFT在任务进度计算上附带检查点。为了处理多节点故障，我们提出了查询元数据检查点。我们跟踪所有reduce任务的输入键值对和中间数据之间的映射。因此，RAFT不需要完全重新执行已完成的映射任务。相反，RAFT只重新计算为本地reduce任务处理的中间数据，因此不会将其发送到另一个节点进行处理。我们还介绍了一种充分利用这些恢复算法的调度策略。我们在Hadoop之上实现了RAFT，并使用三个常见的分析任务在一个45节点的集群上对其进行了评估。总的来说，我们的实验结果表明，在任务和节点故障情况下，RAFT的运行时性能平均比Hadoop高出23%。结果还表明，RAFT的运行时开销可以忽略不计。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

RAFTing MapReduce: Fast recovery on the RAFT

MapReduce is a computing paradigm that has gained a lot of popularity as it allows non-expert users to easily run complex analytical tasks at very large-scale. At such scale, task and node failures are no longer an exception but rather a characteristic of large-scale systems. This makes fault-tolerance a critical issue for the efficient operation of any application. MapReduce automatically reschedules failed tasks to available nodes, which in turn recompute such tasks from scratch. However, this policy can significantly decrease performance of applications. In this paper, we propose a family of Recovery Algorithms for Fast-Tracking (RAFT) MapReduce. As ease-of-use is a major feature of MapReduce, RAFT focuses on simplicity and also non-intrusiveness, in order to be implementation-independent. To efficiently recover from task failures, RAFT exploits the fact that MapReduce produces and persists intermediate results at several points in time. RAFT piggy-backs checkpoints on the task progress computation. To deal with multiple node failures, we propose query metadata checkpointing. We keep track of the mapping between input key-value pairs and intermediate data for all reduce tasks. Thereby, RAFT does not need to re-execute completed map tasks entirely. Instead RAFT only recomputes intermediate data that were processed for local reduce tasks and hence not shipped to another node for processing. We also introduce a scheduling strategy taking full advantage of these recovery algorithms. We implemented RAFT on top of Hadoop and evaluated it on a 45-node cluster using three common analytical tasks. Overall, our experimental results demonstrate that RAFT outperforms Hadoop runtimes by 23% on average under task and node failures. The results also show that RAFT has negligible runtime overhead.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2011 IEEE 27th International Conference on Data Engineering

自引率

0.00%

发文量