Multi-GPU MapReduce on GPU Clusters

Jeff A. Stuart, John Douglas Owens
{"title":"Multi-GPU MapReduce on GPU Clusters","authors":"Jeff A. Stuart, John Douglas Owens","doi":"10.1109/IPDPS.2011.102","DOIUrl":null,"url":null,"abstract":"We present GPMR, our stand-alone MapReduce library that leverages the power of GPU clusters for large-scale computing. To better utilize the GPU, we modify MapReduce by combining large amounts of map and reduce items into chunks and using partial reductions and accumulation. We use persistent map and reduce tasks and stress aspects of GPMR with a set of standard MapReduce benchmarks. We run these benchmarks on a GPU cluster and achieve desirable speedup and efficiency for all benchmarks. We compare our implementation to the current-best GPU-MapReduce library (runs only on a solo GPU) and a highly-optimized multi-core MapReduce to show the power of GPMR. We demonstrate how typical MapReduce tasks are easily modified to fit into GPMR and leverage a GPU cluster. We highlight how total and relative amounts of communication affect GPMR. We conclude with an exposition on the types of MapReduce tasks well-suited to GPMR, and why some tasks need more modifications than others to work well with GPMR.","PeriodicalId":355100,"journal":{"name":"2011 IEEE International Parallel & Distributed Processing Symposium","volume":"72 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"217","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 IEEE International Parallel & Distributed Processing Symposium","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IPDPS.2011.102","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 217

Abstract

We present GPMR, our stand-alone MapReduce library that leverages the power of GPU clusters for large-scale computing. To better utilize the GPU, we modify MapReduce by combining large amounts of map and reduce items into chunks and using partial reductions and accumulation. We use persistent map and reduce tasks and stress aspects of GPMR with a set of standard MapReduce benchmarks. We run these benchmarks on a GPU cluster and achieve desirable speedup and efficiency for all benchmarks. We compare our implementation to the current-best GPU-MapReduce library (which runs on only a single GPU) and a highly-optimized multi-core MapReduce to show the power of GPMR. We demonstrate how typical MapReduce tasks are easily modified to fit into GPMR and leverage a GPU cluster. We highlight how total and relative amounts of communication affect GPMR. We conclude with an exposition on the types of MapReduce tasks well-suited to GPMR, and why some tasks need more modifications than others to work well with GPMR.
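The abstract describes GPMR's key modification to MapReduce: packing many map and reduce items into chunks, reducing them partially on the GPU, and accumulating partial results. The following is a minimal CUDA sketch of that chunking/partial-reduction/accumulation pattern on a single device; it is not GPMR code, and the kernel name (mapAndPartialReduce), the stand-in square-and-sum workload, and the chunk sizes are illustrative assumptions only.

```cuda
// Illustrative sketch (not the GPMR API): map a chunk of input items on the GPU,
// partially reduce within each thread block, then accumulate the per-chunk
// partial results on the host.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void mapAndPartialReduce(const int *in, long long *blockSums, int n) {
    extern __shared__ long long sdata[];
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + threadIdx.x;

    // Map step: each thread emits one value (here, a trivial square).
    long long v = (i < n) ? (long long)in[i] * in[i] : 0;
    sdata[tid] = v;
    __syncthreads();

    // Partial reduction within the block (tree reduction in shared memory).
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s) sdata[tid] += sdata[tid + s];
        __syncthreads();
    }
    if (tid == 0) blockSums[blockIdx.x] = sdata[0];
}

int main() {
    const int chunkSize = 1 << 20;          // items per chunk sent to the GPU
    const int numChunks = 4;                // chunks processed in sequence
    const int threads = 256;
    const int blocks = (chunkSize + threads - 1) / threads;

    int *dIn;
    long long *dBlockSums;
    cudaMalloc(&dIn, chunkSize * sizeof(int));
    cudaMalloc(&dBlockSums, blocks * sizeof(long long));

    int *hIn = new int[chunkSize];
    long long *hBlockSums = new long long[blocks];
    long long total = 0;                    // accumulation across chunks

    for (int c = 0; c < numChunks; ++c) {
        for (int i = 0; i < chunkSize; ++i) hIn[i] = 1;   // stand-in input data
        cudaMemcpy(dIn, hIn, chunkSize * sizeof(int), cudaMemcpyHostToDevice);

        mapAndPartialReduce<<<blocks, threads, threads * sizeof(long long)>>>(
            dIn, dBlockSums, chunkSize);
        cudaMemcpy(hBlockSums, dBlockSums, blocks * sizeof(long long),
                   cudaMemcpyDeviceToHost);

        // Accumulate this chunk's partial reductions into the running total.
        for (int b = 0; b < blocks; ++b) total += hBlockSums[b];
    }

    printf("total = %lld\n", total);        // expect numChunks * chunkSize

    delete[] hIn;
    delete[] hBlockSums;
    cudaFree(dIn);
    cudaFree(dBlockSums);
    return 0;
}
```

In GPMR itself the map and reduce tasks are persistent and work is distributed across the GPUs of a cluster; the sketch above only shows the single-device pattern of chunked mapping, partial reduction, and host-side accumulation that the abstract refers to.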