Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing (latest papers)

Embedding of k-ary complete trees into hypercubes with optimal load

Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing. Pub Date: 1996-10-23. DOI: 10.1109/SPDP.1996.570390

Jan Trdlicka, P. Tvrdík

Abstract: The main result of the paper is an algorithm for embedding k-ary complete trees into hypercubes with optimal load and asymptotically optimal dilation. The algorithm is fully scalable: the dimension of the hypercube can be chosen independently of the arity and height of the complete tree. The basic property of the embedded tree is that both the tree nodes at any given level and all the tree nodes together are uniformly distributed within equally sized subcubes of the hypercube. This implies that no hypercube node is loaded with more than $\lceil A_h / 2^n \rceil$ tree nodes and $\lceil B_h / 2^n \rceil$ leaves of the tree, where $A_h$ is the number of all tree nodes, $B_h$ is the number of leaves of the k-ary complete tree of height $h$, and $n$ is the dimension of the hypercube. The embedding enables optimal emulations of both divide-and-conquer computations on the k-ary complete tree, where only one level of nodes is active at a time, and general computations based on k-ary complete trees, where all tree nodes are active simultaneously. As a special case, the authors obtain an algorithm for embedding the k-ary complete tree of height $h$ into its optimal hypercube with load 1 and with dilation that is worse than the lower bound by only a small constant factor. This improves on the best previous result, by Shen et al. (1995), whose embedding has load 1 and nearly optimal dilation but requires a hypercube much larger than the optimal one.
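The load bounds quoted in the abstract can be checked numerically with a small sketch. It rests on assumptions that the abstract does not state: the root is taken to be at level 0 and the leaves at level h, so that A_h = (k^(h+1) - 1)/(k - 1) and B_h = k^h, and the "optimal hypercube" is read as the smallest hypercube with at least A_h nodes. The function names are hypothetical, and the code only evaluates the bounds; it does not construct the embedding itself.

```python
def tree_sizes(k: int, h: int) -> tuple[int, int]:
    """Return (A_h, B_h) for a complete k-ary tree.

    Assumption: root at level 0, leaves at level h.
    """
    if k < 2 or h < 0:
        raise ValueError("need arity k >= 2 and height h >= 0")
    leaves = k ** h                          # B_h
    nodes = (k ** (h + 1) - 1) // (k - 1)    # A_h, geometric series sum
    return nodes, leaves


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling of a / b for positive b."""
    return -(-a // b)


def load_bounds(k: int, h: int, n: int) -> tuple[int, int]:
    """Per-node bounds (ceil(A_h / 2^n), ceil(B_h / 2^n)) for an n-cube."""
    nodes, leaves = tree_sizes(k, h)
    cube_size = 2 ** n
    return ceil_div(nodes, cube_size), ceil_div(leaves, cube_size)


def optimal_dimension(k: int, h: int) -> int:
    """Smallest n with 2^n >= A_h (assumed meaning of 'optimal hypercube')."""
    nodes, _ = tree_sizes(k, h)
    return (nodes - 1).bit_length()


if __name__ == "__main__":
    print(tree_sizes(3, 4))         # (121, 81)
    print(load_bounds(3, 4, 5))     # (4, 3)
    print(optimal_dimension(3, 4))  # 7
    print(load_bounds(3, 4, 7))     # (1, 1)
```

For a ternary tree of height 4 under these assumptions, a 5-dimensional hypercube gives bounds of 4 tree nodes and 3 leaves per hypercube node, and the 7-dimensional hypercube gives load 1, matching the special case described in the abstract.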

Citations: 2

Extending functional languages with stateful computations

Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing. Pub Date: 1996-10-23. DOI: 10.1109/SPDP.1996.570381

Yung-Syau Chen, J. Gaudiot

Citations: 1

A compiler address transformation for conflict-free access of memories and networks

Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing. Pub Date: 1996-10-23. DOI: 10.1109/SPDP.1996.570378

M. Al-Mouhamed, L. Bic, Husam Abu-Haimed

Abstract: A method for mapping arrays into parallel memories to minimize serialization and network conflicts for lock-step systems is presented. Each array is associated with an arbitrary number of data access patterns that can be identified following compiler data-dependence analysis. Conditions for conflict-free access of parallel memories and the network are derived for arbitrary power-of-2 data patterns and arbitrary multistage networks. The authors propose an efficient heuristic to synthesize a combined address transformation (an NP-complete problem) that applies to arbitrary linear patterns, arbitrary multistage networks, and an arbitrary number of power-of-2 memories. The method can be implemented as part of the address transformation (XOR and AND) or through compiler emulation. The performance of optimized storage schemes is presented for FFT, arbitrary sets of data patterns, non-power-of-2 stride access in vector processors, interleaving, and static row-column storage. Their approach is profitable in all the above cases and provides a systematic method for converting array-memory mapping and network aspects of algorithms from one network topology to another.
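To illustrate what an XOR-based address transformation can achieve, the sketch below compares plain low-order interleaving with a textbook XOR skewing scheme on 2^m memory banks. This is a generic scheme chosen for illustration, not the heuristic proposed by the authors, and all function names are hypothetical. It assumes a 2^m by 2^m array stored in row-major order and checks that both row and column accesses reach distinct banks.

```python
def interleaved_bank(addr: int, m: int) -> int:
    """Plain interleaving: the bank index is the low m address bits."""
    return addr & ((1 << m) - 1)


def xor_bank(addr: int, m: int) -> int:
    """XOR skewing: low m bits XORed with the next m bits (illustrative)."""
    mask = (1 << m) - 1
    return (addr & mask) ^ ((addr >> m) & mask)


def conflicts(addresses, bank_of, m: int) -> int:
    """Count accesses that collide with an earlier access to the same bank."""
    banks = [bank_of(a, m) for a in addresses]
    return len(banks) - len(set(banks))


if __name__ == "__main__":
    m = 3                      # 8 banks, 8 x 8 row-major array (assumption)
    size = 1 << m
    row = [2 * size + j for j in range(size)]     # row 2, all columns
    column = [i * size + 5 for i in range(size)]  # column 5, all rows

    print(conflicts(row, interleaved_bank, m))     # 0
    print(conflicts(column, interleaved_bank, m))  # 7
    print(conflicts(row, xor_bank, m))             # 0
    print(conflicts(column, xor_bank, m))          # 0
```

With plain interleaving, the stride-8 column access sends all eight requests to one bank, whereas the XOR mapping assigns element (i, j) to bank i XOR j, which is distinct along any single row or column.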

Citations: 3

An empirical study of dynamic scheduling on rings of processors

Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing. Pub Date: 1996-10-23. DOI: 10.1109/SPDP.1996.570370

M. E. Barrows, Dawn E. Gregory, Lixin Gao, A. Rosenberg, P. Cohen

Citations: 8

Performance of parallel algorithms for a fingerprint image comparison system

Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing. Pub Date: 1996-10-23. DOI: 10.1109/SPDP.1996.570362

H. Ammar, Zhouhui Miao

Citations: 0

An efficient parallel scheduling algorithm

Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing. Pub Date: 1996-10-23. DOI: 10.1109/SPDP.1996.570342

Minyou Wu

Citations: 4

Impact of load balancing on unstructured adaptive grid computations for distributed-memory multiprocessors

Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing. Pub Date: 1996-10-23. DOI: 10.1109/SPDP.1996.570313

A. Sohn, R. Biswas, H. Simon

Citations: 28

Measurement and simulation based performance analysis of parallel I/O in a high-performance cluster system

Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing. Pub Date: 1996-10-23. DOI: 10.1109/SPDP.1996.570351

C. Natarajan, R. Iyer

Citations: 3