A CUDA-MPI Hybrid Bitonic Sorting Algorithm for GPU Clusters

2012 41st International Conference on Parallel Processing Workshops. Published: 2012-09-10. DOI: 10.1109/ICPPW.2012.82

Sam White, Niels J. Verosky, T. Newhall

Citations: 17

Abstract

We present a hybrid CUDA-MPI sorting algorithm that uses GPU clusters to sort large data sets. Our algorithm has two phases. In the first phase, each node sorts a portion of the data on its GPU using a parallel bitonic sort. In the second phase, the sorted subsequences are merged in parallel using a reduction sorting network implemented in MPI across the cluster nodes. Compared with sequential quicksort, our algorithm achieves speed-ups of up to 9.8 when sorting 4 GB of data on a 32-node GPU cluster. We anticipate even better speed-ups on larger data sets and larger clusters.
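
The two phases can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the authors' CUDA-MPI implementation: the function names `bitonic_sort`, `merge_sorted`, and `hybrid_sort` are hypothetical, the "nodes" are plain Python lists, and phase 2 is assumed to be a pairwise binary-tree reduction, which may differ from the reduction sorting network used in the paper. The sketch also assumes the node count and the per-node chunk size are powers of two.

```python
import heapq


def bitonic_sort(values):
    """Sort a list whose length is a power of two with an iterative bitonic network.

    Every pass over i for a fixed (k, j) consists of independent
    compare-exchange steps, which is why the network suits GPU threads.
    """
    a = list(values)
    n = len(a)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    k = 2
    while k <= n:              # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:          # distance between compared elements
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    # Swap when the pair is out of order for its direction.
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a


def merge_sorted(left, right):
    """Merge two sorted lists into one sorted list."""
    return list(heapq.merge(left, right))


def hybrid_sort(data, num_nodes):
    """Simulate both phases; num_nodes and the chunk size must be powers of two."""
    chunk = len(data) // num_nodes
    assert chunk * num_nodes == len(data), "data must divide evenly across nodes"
    # Phase 1: each simulated node sorts its own chunk (on the GPU in the paper).
    runs = [bitonic_sort(data[r * chunk:(r + 1) * chunk]) for r in range(num_nodes)]
    # Phase 2 (assumed tree reduction): at each step, node r absorbs the
    # sorted run of node r + stride, halving the number of active nodes.
    stride = 1
    while stride < num_nodes:
        for r in range(0, num_nodes, 2 * stride):
            runs[r] = merge_sorted(runs[r], runs[r + stride])
            runs[r + stride] = []
        stride *= 2
    return runs[0]
```

For example, `hybrid_sort(data, 4)` on a 64-element list sorts four 16-element chunks independently and then merges them in two reduction steps. In a real cluster setting, each merge step would involve a message exchange between two nodes rather than a list operation.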
