MPI Overlap: Benchmark and Analysis

2016 45th International Conference on Parallel Processing (ICPP) | Pub Date: 2016-08-01 | DOI: 10.1109/ICPP.2016.37

Alexandre Denis, François Trahay


Citations: 22

Abstract



In HPC applications, one of the major overheads compared to sequential code is communication cost. Application programmers often amortize this cost by overlapping communication with computation. To do so, they post a non-blocking MPI request, perform computation, and wait for communication completion, assuming that MPI communication will progress in the background. In this paper, we propose to measure what really happens when trying to overlap non-blocking point-to-point communication with computation. We explain how background progression works, describe relevant test cases, identify challenges for a benchmark, and then propose a benchmark suite to measure how much overlap happens in various cases. We present overlap benchmark results on a wide range of MPI libraries and hardware platforms. Finally, we classify, analyze, and explain the results using low-level traces to reveal the internal behavior of the MPI library.
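The post / compute / wait pattern described in the abstract can be sketched as follows. This is only an illustration, not the benchmark suite from the paper: the helper names `timed_overlap` and `overlap_ratio` are hypothetical, and the ratio formula is an assumed metric chosen for illustration (the abstract does not define one). The callables `post`, `compute`, and `wait` stand in for a non-blocking MPI request, the application's computation, and the completion call.

```python
import time
from typing import Callable, TypeVar

R = TypeVar("R")


def timed_overlap(
    post: Callable[[], R],
    compute: Callable[[], None],
    wait: Callable[[R], None],
) -> float:
    """Time one post / compute / wait sequence and return elapsed seconds."""
    start = time.perf_counter()
    request = post()   # start the transfer (e.g. a non-blocking send)
    compute()          # work that is hoped to hide the transfer
    wait(request)      # block until the transfer has completed
    return time.perf_counter() - start


def overlap_ratio(t_comm: float, t_comp: float, t_total: float) -> float:
    """Assumed metric (not taken from the paper): share of the hideable
    time that was actually hidden.

    1.0 means t_total == max(t_comm, t_comp)  (perfect overlap)
    0.0 means t_total == t_comm + t_comp      (no overlap at all)
    """
    hideable = min(t_comm, t_comp)
    if hideable <= 0.0:
        return 0.0
    hidden = t_comm + t_comp - t_total
    # Clamp so that measurement noise cannot push the ratio outside [0, 1].
    return max(0.0, min(1.0, hidden / hideable))
```

With mpi4py, for instance, `post` could wrap `comm.Isend(...)` and `wait` could call `Wait()` on the returned request. Here `t_comm` and `t_comp` would be measured separately, with the communication and the computation each run alone. A ratio near 0 would suggest that the library made little progress in the background while the computation was running.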
