A generalized basic cycle calculation method for efficient array redistribution

Ching-Hsien Hsu, Sheng-Wen Bai, Yeh-Ching Chung, Chu-Sing Yang
Proceedings 1998 International Conference on Parallel and Distributed Systems (Cat. No.98TB100250), published 1998-12-14
DOI: https://doi.org/10.1109/ICPADS.1998.741147
Citations: 33

Abstract

In many scientific applications, dynamic array redistribution is often required to improve an algorithm's performance. We present a generalized basic cycle calculation (GBCC) method that efficiently redistributes an array from BLOCK-CYCLIC(s) over P processors to BLOCK-CYCLIC(t) over Q processors. In the GBCC method, each processor first computes the source/destination processor/data sets of the array elements in the first generalized basic cycle of its local array. A generalized basic cycle is defined as lcm(sP, tQ)/(gcd(s, t) × P) in the source distribution and lcm(sP, tQ)/(gcd(s, t) × Q) in the destination distribution. From these processor/data sets, packing/unpacking pattern tables are constructed, and each processor uses the tables to pack/unpack array elements efficiently. To evaluate the GBCC method, we implemented it on an IBM SP2 parallel machine, along with the PITFALLS method and the ScaLAPACK method, and we present cost models for all three methods. The experimental results show that the GBCC method outperforms the PITFALLS and ScaLAPACK methods on all test samples. We also briefly describe how the GBCC method extends to multidimensional array redistribution.
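As a rough illustration of the quantities named in the abstract, the following Python sketch evaluates the two generalized basic cycle expressions and tabulates, for one global period of lcm(sP, tQ) elements, which destination processor each source processor's elements go to. It is not the authors' GBCC implementation: the function names (`gbc_sizes`, `owner`, `send_pattern`) are hypothetical, the ownership rule `(i // block) % nprocs` is the standard block-cyclic mapping assumed here, and the table is built by brute-force enumeration rather than by the paper's pattern-table construction.

```python
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def gbc_sizes(s, P, t, Q):
    """Evaluate the abstract's two expressions:
    lcm(sP, tQ) / (gcd(s, t) * P) for the source distribution and
    lcm(sP, tQ) / (gcd(s, t) * Q) for the destination distribution.
    Both divisions are exact because gcd(s, t) divides s and t."""
    period = lcm(s * P, t * Q)
    g = gcd(s, t)
    return period // (g * P), period // (g * Q)


def owner(i, block, nprocs):
    """Processor that owns global index i under BLOCK-CYCLIC(block)
    over nprocs processors (standard mapping, assumed here)."""
    return (i // block) % nprocs


def send_pattern(s, P, t, Q):
    """For each source processor, list the destination processor of each
    of its elements within one global period of lcm(sP, tQ) elements.
    Brute-force enumeration for illustration only."""
    period = lcm(s * P, t * Q)
    table = {p: [] for p in range(P)}
    for i in range(period):
        table[owner(i, s, P)].append(owner(i, t, Q))
    return table


if __name__ == "__main__":
    # BLOCK-CYCLIC(2) over 3 processors to BLOCK-CYCLIC(3) over 2 processors
    print(gbc_sizes(2, 3, 3, 2))
    print(send_pattern(2, 3, 3, 2))
```

Because both distributions repeat with period lcm(sP, tQ), a table computed once for the first period can be reused for every later period of the array, which is the intuition behind packing and unpacking from a precomputed pattern.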