Efficient algorithms for multi-dimensional block-cyclic redistribution of arrays

Proceedings of the 1997 International Conference on Parallel Processing (Cat. No.97TB100162). Pub Date: 1997-08-11. DOI: 10.1109/ICPP.1997.622650

Y. Lim, Neungsoo Park, V. Prasanna

Citations: 4

Abstract

We present a uniform framework for a classical problem: redistribution of a multi-dimensional array. Using a generalized circulant matrix formalism, we derive efficient direct, indirect, and hybrid contention-free communication schedules. Our indirect schedule significantly reduces the number of communication steps compared with previous approaches. Our approach exploits the regularity of block-cyclic redistribution to minimize index computation overhead. For 2-D redistribution, when the block size increases by factors of $K_1$ and $K_2$ along the two dimensions and the process topology remains fixed, our indirect schedule performs the redistribution in $O(\log(K_1 K_2))$ communication steps. When the block size is fixed and the processor topology is transposed, our indirect schedule takes $O(\log(L/G))$ communication steps. Implementations of our algorithms on the IBM SP-2 show superior performance over previous approaches.
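To make the redistribution problem concrete, the following is a minimal brute-force sketch in Python, assuming the standard definition of a block-cyclic distribution (blocks of consecutive elements dealt round-robin to processors along each dimension). It only tabulates which processor must send how many elements to which other processor; it is not the circulant-matrix schedule derived in the paper, and the names `owner` and `send_table_2d` are hypothetical.

```python
from collections import defaultdict


def owner(i, block, nprocs):
    """Processor that owns global index i when blocks of `block`
    consecutive elements are dealt round-robin to `nprocs` processors."""
    return (i // block) % nprocs


def send_table_2d(shape, src_blocks, dst_blocks, grid):
    """Count the elements each source processor must send to each
    destination processor when a 2-D array changes block size on a
    fixed processor grid. Brute force over all elements (illustration only)."""
    n1, n2 = shape
    x1, x2 = src_blocks   # block sizes before redistribution
    y1, y2 = dst_blocks   # block sizes after redistribution
    p1, p2 = grid         # processor grid, unchanged
    table = defaultdict(int)
    for i in range(n1):
        for j in range(n2):
            src = (owner(i, x1, p1), owner(j, x2, p2))
            dst = (owner(i, y1, p1), owner(j, y2, p2))
            table[(src, dst)] += 1
    return dict(table)


if __name__ == "__main__":
    # 8 x 8 array on a 2 x 2 grid; block size grows from 1 x 1 to 2 x 2,
    # i.e. factors K_1 = K_2 = 2 in the abstract's notation.
    table = send_table_2d((8, 8), (1, 1), (2, 2), (2, 2))
    for (src, dst), count in sorted(table.items()):
        print(f"{src} -> {dst}: {count} elements")
```

In this small case every processor exchanges data with every processor, which is the kind of dense communication pattern that a contention-free schedule has to order into steps.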

