Planning for performance: persistent collective operations for MPI

Proceedings of the 24th European MPI Users' Group Meeting. Publication date: 2017-09-25. DOI: 10.1145/3127024.3127028

B. Morgan, Daniel J. Holmes, A. Skjellum, P. Bangalore, Srinivas Sridharan

Citation count: 10

Abstract

The advantages of nonblocking collective communication in MPI have been established over the past quarter century, even predating MPI-1. For regular computations with fixed communication patterns, further optimizations can be revealed through the use of persistence (planned transfers), which is not currently available in the MPI-3 API except for a limited form of point-to-point persistence (aka half-channels) standardized since MPI-1. This paper covers the design and prototype implementation of LibPNBC (based on LibNBC) and the MPI-4 standardization status of persistent nonblocking collective operations. We provide early performance results using a modified version of NBCBench, together with an example illustrating the potential performance enhancements for such operations. Persistent operations allow MPI implementations to make intelligent choices about algorithm and resource utilization once and to amortize this decision cost across many uses in a long-running program. We provide evidence that this approach is of value. As with non-persistent nonblocking collective operations, strong progress and blocking completion notification are jointly needed to maximize the benefit of such operations (e.g., overlap of communication with computation or other communication). Future work comprises further enhancement of the current prototype implementation and additional opportunities to enhance performance through the application of these new APIs.
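
The amortization argument in the abstract can be illustrated with a toy model. The sketch below is not LibPNBC and does not use any MPI API. Every name in it (`Planner`, `run_nonpersistent`, `run_persistent`, the 1024-element threshold and the algorithm labels) is a hypothetical stand-in, chosen only to show a decision being made once and reused, as opposed to being repeated on every call.

```python
from typing import List


class Planner:
    """Hypothetical stand-in for an MPI library's one-time decision logic."""

    def __init__(self) -> None:
        self.decisions = 0  # how many times the (costly) choice was made

    def choose(self, count: int) -> str:
        # Assumed rule for illustration only: pick a label by message size.
        self.decisions += 1
        return "small_message" if count <= 1024 else "large_message"


def reduce_sum(algorithm: str, contributions: List[float]) -> float:
    # Both labels compute the same sum here; a real library would differ
    # in how the data moves, not in the result.
    return sum(contributions)


def run_nonpersistent(planner: Planner, rounds: List[List[float]]) -> List[float]:
    # The decision is repeated for every collective call.
    return [reduce_sum(planner.choose(len(r)), r) for r in rounds]


def run_persistent(planner: Planner, rounds: List[List[float]]) -> List[float]:
    # The decision is made once (the "plan") and reused for every start.
    # This assumes a fixed communication pattern: same size every round.
    algorithm = planner.choose(len(rounds[0]))
    return [reduce_sum(algorithm, r) for r in rounds]


if __name__ == "__main__":
    data = [[1.0, 2.0, 3.0] for _ in range(5)]
    a, b = Planner(), Planner()
    print(run_nonpersistent(a, data), a.decisions)
    print(run_persistent(b, data), b.decisions)
```

In this model both variants return the same results, but the non-persistent variant makes one decision per round while the persistent variant makes one in total. That difference is the cost the abstract describes as being amortized over a long-running program.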
