Shortest Processing Time First Algorithm for Hadoop

2016 IEEE 3rd International Conference on Cyber Security and Cloud Computing (CSCloud). Pub Date: 2016-06-25. DOI: 10.1109/CSCloud.2016.12

Laurent Bobelin, P. Martineau, Di Zhao, Haiwu He

Citations: 3

Abstract

Big data has proven to be a powerful tool in many sectors, from science to business. Distributed data-parallel computing is therefore now common: using a large number of computing and storage resources makes it possible to process data at a previously unattainable scale. Developing large-scale distributed big data processing, however, raises many challenges, and scheduling is one of the most complex. Shortest Processing Time First (SPT) is a classic scheduling policy used in many systems, because it is known to be an optimal online scheduling policy for minimizing average flow time. We therefore integrated this policy into Hadoop, a framework for big data processing, and implemented a prototype. This paper describes this integration and the test results obtained on our testbed.
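
To make the flow-time claim concrete, here is a minimal sketch in Python. It is not the authors' Hadoop implementation. It assumes a single machine, no preemption, and all jobs available at time zero; the job durations and the function names `average_flow_time` and `spt_order` are hypothetical and chosen only for illustration.

```python
from itertools import accumulate


def average_flow_time(durations):
    """Mean completion time when jobs run back to back in the given order.

    Assumes every job is released at time 0, so flow time equals completion time.
    """
    completions = list(accumulate(durations))
    return sum(completions) / len(completions)


def spt_order(durations):
    """Return the jobs ordered by Shortest Processing Time First."""
    return sorted(durations)


# Hypothetical job durations, listed in arrival (FIFO) order.
jobs = [8, 1, 5, 2]

fifo_avg = average_flow_time(jobs)
spt_avg = average_flow_time(spt_order(jobs))

print(f"FIFO average flow time: {fifo_avg}")  # 11.75
print(f"SPT average flow time:  {spt_avg}")   # 7.0
```

Running short jobs first means fewer jobs wait behind a long one, which is why the SPT order gives the lower average here. A real Hadoop scheduler would also need to estimate job durations, which this sketch takes as given.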
