Partitioning unstructured computational graphs for nonunifor

IEEE Parallel & Distributed Technology: Systems & Applications Pub Date : 1995-09-01 DOI:10.1109/M-PDT.1995.414844

M. Kaddoura, Chao-Wei Ou, S. Ranka


Citations: 42

Abstract

In heterogeneous computing environments, computational resources can have a nonuniform distribution that changes over time. To execute in such an environment, many irregular and loosely synchronous data-parallel applications must be carefully mapped. This article examines algorithms that provide this mapping by efficiently partitioning the computational graphs of these applications.

Heterogeneity has become commonplace in high-performance computing environments. In the future most computing environments will consist of a cluster of nodes connected by a high-speed interconnection network. Node architectures will include high-performance SIMD and MIMD parallel computers as well as numerous high-performance workstations.

In a heterogeneous environment, users can pool many computational resources to create a large virtual machine. This environment can be nonuniform -- that is, the machines or processors can have different computational powers. However, the pool of resources might change over the computation's lifetime because of machine failures or differing use patterns. It should be possible to add or remove resources without significantly affecting the other machines or changing the existing software. In such an adaptive environment, an individual machine could either be dedicated to a single user's computation or shared by users. The former strategy has the advantage that each machine has static computing capability, while the latter has the advantage of a higher rate of use.

In this article we'll examine the mapping requirements for the parallelization of a large class of irregular and loosely synchronous data-parallel applications on nonuniform and adaptive environments. The computational structure of these applications can be described as a computational graph. In such a graph, nodes represent computational tasks and edges describe the communication between tasks.

For many applications, the graph's vertices correspond to 2D and 3D coordinates, and the interaction between computations is limited to physically proximate vertices. Recursive coordinate bisection, index-based mapping, and recursive spectral bisection can exploit these properties to partition such applications. Essentially, these algorithms cluster proximate points together to form a partition such that the numbers of vertices attached to every partition are equal.

Other researchers have used these algorithms to map graphs onto uniform parallel machines. We'll evaluate how the algorithms partition computational graphs on a simulation of a cluster of machines constituting a static, nonuniform environment. (In a static environment, computational resources are fixed throughout the completion of all tasks.) The algorithms assume that an interconnection network connects all the processors and that the cost of unit communication is the same between all the processors. (A bus is an example of such a network.) Although our algorithms specifically target a network-connected cluster of workstations, the issues are similar for parallelizing such applications on a network of machines.

We'll also show how to use or extend these algorithms for an adaptive environment. Mapping graph vertices onto a 1D space can facilitate extremely fast remapping when the environment changes. This simple remapping achieves acceptable partitioning, though poorer than with mapping from scratch.
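The idea of mapping vertices onto a 1D space and then remapping quickly can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: a Morton (Z-order) key stands in for the 1D index (the abstract does not say which index is used), vertex coordinates are non-negative integers, and each machine's computational power is given as a hypothetical list of relative weights `powers`. The helper names (`morton_key`, `order_vertices`, `split_by_power`, `owners`) are likewise invented here.

```python
from itertools import accumulate


def morton_key(x, y, bits=16):
    """Interleave the bits of two non-negative integer coordinates (Z-order)."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key


def order_vertices(coords, bits=16):
    """Return vertex ids sorted along the 1D index, so nearby points stay close."""
    return sorted(range(len(coords)),
                  key=lambda v: morton_key(coords[v][0], coords[v][1], bits))


def split_by_power(order, powers):
    """Cut the 1D ordering into contiguous chunks sized in proportion to powers."""
    cumulative = list(accumulate(powers))
    total = cumulative[-1]
    n = len(order)
    bounds = [0] + [round(n * c / total) for c in cumulative]
    return [order[bounds[i]:bounds[i + 1]] for i in range(len(powers))]


def owners(parts):
    """Map each vertex id to the index of the partition that holds it."""
    return {v: p for p, part in enumerate(parts) for v in part}


# Hypothetical example: a 4x4 grid of vertices, vertex id = 4 * y + x.
coords = [(x, y) for y in range(4) for x in range(4)]
order = order_vertices(coords)            # computed once

before = split_by_power(order, [1, 1, 2])  # three machines, the third twice as fast
after = split_by_power(order, [1, 3])      # environment changes: two machines remain

old, new = owners(before), owners(after)
moved = [v for v in order if old[v] != new[v]]  # vertices that must migrate
```

In this sketch the sort happens once. When the set of machines or their relative powers change, only the cut points along the 1D ordering are recomputed, which is why such a remapping can be fast; it pays no attention to edge cuts, which is consistent with the abstract's remark that the result is acceptable but poorer than mapping from scratch.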

