MPI Overlap: Benchmark and Analysis

2016 45th International Conference on Parallel Processing (ICPP). Pub Date: 2016-08-01. DOI: 10.1109/ICPP.2016.37

Alexandre Denis, François Trahay

Citations: 22

Abstract

In HPC applications, one of the major overheads compared to sequential code is communication cost. Application programmers often amortize this cost by overlapping communication with computation. To do so, they post a non-blocking MPI request, perform computation, and wait for the communication to complete, assuming that MPI communication will progress in the background. In this paper, we propose to measure what really happens when trying to overlap non-blocking point-to-point communications with computation. We explain how background progression works, describe relevant test cases, identify the challenges for a benchmark, and then propose a benchmark suite to measure how much overlap happens in various cases. We present overlap benchmark results on a wide panel of MPI libraries and hardware platforms. Finally, we classify, analyze, and explain the results using low-level traces to reveal the internal behavior of the MPI library.
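
As an illustration of what "how much overlap happens" could mean numerically, the sketch below turns three timings into a ratio between 0 and 1. The formula and the name `overlap_ratio` are assumptions made for this example, not the metric defined in the paper: it supposes that the communication alone, the computation alone, and the combined post/compute/wait sequence have each been timed separately.

```python
def overlap_ratio(t_comm: float, t_comp: float, t_total: float) -> float:
    """Fraction of the overlappable time that was actually hidden.

    Hypothetical metric for illustration only (not taken from the paper).
    t_comm  : time for the communication alone (post + wait, no computation)
    t_comp  : time for the computation alone
    t_total : time for post, compute, wait measured together
    """
    shorter = min(t_comm, t_comp)
    if shorter <= 0:
        raise ValueError("t_comm and t_comp must be positive")
    # Time saved compared with running the two phases one after the other.
    hidden = t_comm + t_comp - t_total
    # At best, the shorter of the two phases is hidden completely.
    ratio = hidden / shorter
    # Clamp so that timer noise cannot push the result outside [0, 1].
    return max(0.0, min(1.0, ratio))
```

With made-up timings, a total equal to the longer phase gives a ratio of 1 (full overlap), while a total equal to the sum of both phases gives 0 (no background progression at all).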


