Implementation and performance analysis of non-blocking collective operations for MPI

Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07) Pub Date : 2007-11-16 DOI:10.1145/1362622.1362692

T. Hoefler, A. Lumsdaine, W. Rehm

{"title":"Implementation and performance analysis of non-blocking collective operations for MPI","authors":"T. Hoefler, A. Lumsdaine, W. Rehm","doi":"10.1145/1362622.1362692","DOIUrl":null,"url":null,"abstract":"Collective operations and non-blocking point-to-point operations have always been part of MPI. Although non-blocking collective operations are an obvious extension to MPI, there have been no comprehensive studies of this functionality. In this paper we present LibNBC, a portable high-performance library for implementing non-blocking collective MPI communication operations. LibNBC provides non-blocking versions of all MPI collective operations, is layered on top of MPI-1, and is portable to nearly all parallel architectures. To measure the performance characteristics of our implementation, we also present a microbenchmark for measuring both latency and overlap of computation and communication. Experimental results demonstrate that the blocking performance of the collective operations in our library is comparable to that of collective operations in other high-performance MPI implementations. Our library introduces a very low overhead between the application and the underlying MPI and thus, in conjunction with the potential to overlap communication with computation, offers the potential for optimizing real-world applications.","PeriodicalId":274744,"journal":{"name":"Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2007-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"201","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1362622.1362692","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 201

Abstract

Collective operations and non-blocking point-to-point operations have always been part of MPI. Although non-blocking collective operations are an obvious extension to MPI, there have been no comprehensive studies of this functionality. In this paper we present LibNBC, a portable high-performance library for implementing non-blocking collective MPI communication operations. LibNBC provides non-blocking versions of all MPI collective operations, is layered on top of MPI-1, and is portable to nearly all parallel architectures. To measure the performance characteristics of our implementation, we also present a microbenchmark for measuring both latency and overlap of computation and communication. Experimental results demonstrate that the blocking performance of the collective operations in our library is comparable to that of collective operations in other high-performance MPI implementations. Our library introduces a very low overhead between the application and the underlying MPI and thus, in conjunction with the potential to overlap communication with computation, offers the potential for optimizing real-world applications.

查看原文本刊更多论文

MPI非阻塞集合操作的实现与性能分析

集体操作和非阻塞点对点操作一直是MPI的一部分。虽然非阻塞集体操作是MPI的明显扩展，但尚未对该功能进行全面研究。在本文中，我们提出了libbc，一个可移植的高性能库，用于实现非阻塞的集体MPI通信操作。libbc提供了所有MPI集合操作的非阻塞版本，它位于MPI-1之上，并且几乎可以移植到所有并行体系结构中。为了测量我们实现的性能特征，我们还提供了一个微基准，用于测量计算和通信的延迟和重叠。实验结果表明，本文库中集合操作的阻塞性能与其他高性能MPI实现中的集合操作相当。我们的库在应用程序和底层MPI之间引入了非常低的开销，因此，结合通信与计算重叠的潜力，提供了优化实际应用程序的潜力。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07)

自引率

0.00%

发文量