A Locality-conscious load-balancer based on negotiations in dynamic unstructured mesh computations

Parallel Algorithms and Applications. Pub Date: 2004-06-01. DOI: 10.1080/10637190412331279966

A. Mohamed, Veysel S. Baydogan


Citations: 1

Abstract

Hybrid/multi-level parallel programming models are gaining momentum because they have been shown to provide better scalability, speedup and utilization than any single parallel programming model alone. In such models, load balancing should mean not only balancing the computational load, as it has traditionally been understood, but also correcting I/O imbalance and synchronization imbalance. In this paper, we propose a generic multi-agent framework for dynamic load balancing that is independent of application, language and programming model. It takes most of the load-balancing burden away from programmers. It is not a library but a runtime support system that is not hardwired to the parallel application. The framework is intended to handle varying levels of load change in computation, I/O and/or synchronization throughout the application run, and it is an open architecture that currently supports four multi-level parallel programming models. It has a clean interface to the application, runs in parallel and provides additional functionality, such as determining when to balance load and providing an interface to end users. The proposed open-architecture multi-agent load balancer currently uses a leading geometric partitioner engine (Chaco) at runtime. A mesh solver may initially create hundreds of lightweight threads, each handling a small submesh obtained by calling the Chaco partitioning engine in a pre-processing stage. A lightweight thread may call the partitioner engine again if a divide-and-conquer step is deemed necessary, that is, when the sub-domain (submesh) served by that thread grows beyond certain threshold limits and thus creates an imbalance.

In the proposed framework, the multi-agent system is a set of SMP-based load balancers (agents) that do not have to share any data structure with the parallel application threads. They frequently monitor and collect system and application data from outside the multi-threaded parallel application solver, and they send adjustments and negotiation plans to the SMP load balancers and the application threads whenever a need for load balancing arises. The proposed framework has been deployed in four hybrid/multi-level parallel programming models, and its ability to issue corrective actions against emerging imbalances was tested in the context of an adaptive mesh refinement application. Experimental results show that the framework is effective in monitoring, tuning and rebalancing emerging computational, I/O and synchronization sources of load imbalance.
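The monitoring step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `ThreadSample` record, the max-over-mean imbalance metric, the `tolerance` and `max_elems` thresholds, and the action tuples are all assumptions introduced here for illustration. An agent receives per-thread measurements taken from outside the solver, checks the three imbalance sources separately, and flags any submesh that has grown past a size limit as a candidate for a divide-and-conquer repartitioning. The actual call to a partitioner such as Chaco is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class ThreadSample:
    """Hypothetical per-thread measurements collected by an external agent."""
    thread_id: int
    compute_s: float    # time spent computing in the last interval
    io_s: float         # time spent in I/O
    sync_s: float       # time spent waiting on synchronization
    submesh_elems: int  # current number of elements in the thread's submesh


def imbalance(values):
    """Return max/mean - 1; a value of 0 means perfectly balanced."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return 0.0 if mean == 0 else max(values) / mean - 1.0


def plan_actions(samples, tolerance=0.25, max_elems=1000):
    """Propose corrective actions; both thresholds are illustrative assumptions."""
    actions = []
    # Check computation, I/O and synchronization as separate imbalance sources.
    for source in ("compute_s", "io_s", "sync_s"):
        values = [getattr(s, source) for s in samples]
        if imbalance(values) > tolerance:
            worst = max(samples, key=lambda s: getattr(s, source))
            actions.append(("rebalance", source, worst.thread_id))
    # A submesh that outgrows the limit is a candidate for being split again.
    for s in samples:
        if s.submesh_elems > max_elems:
            actions.append(("split", "submesh", s.thread_id))
    return actions


if __name__ == "__main__":
    samples = [
        ThreadSample(0, 1.0, 0.1, 0.1, 500),
        ThreadSample(1, 1.0, 0.1, 0.1, 500),
        ThreadSample(2, 4.0, 0.1, 0.1, 1500),  # refined region: more work, larger submesh
    ]
    for action in plan_actions(samples):
        print(action)
```

Keeping the decision logic in a pure function over sampled data mirrors the abstract's point that the agents need not share data structures with the application threads. In a real system the returned actions would feed the negotiation between load balancers rather than be applied directly.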




Source journal

Parallel Algorithms and Applications

Self-citation rate

0.00%

Publication volume