Characterizing MPI matching via trace-based simulation

Proceedings of the 24th European MPI Users' Group Meeting | Pub Date: 2017-09-25 | DOI: 10.1145/3127024.3127040

Kurt B. Ferreira, Scott Levy, K. Pedretti, Ryan E. Grant

Citations: 13

Abstract

With the increased scale expected on future leadership-class systems, detailed information about the resource usage and performance of MPI message matching provides important insights into how to maintain application performance on next-generation systems. However, obtaining MPI message matching performance data is often not possible without significant effort. A common approach is to instrument an MPI implementation to collect relevant statistics. While this approach can provide important data, collecting matching data at runtime perturbs the application's execution, including its matching performance, and is highly dependent on the MPI library's matchlist implementation. In this paper, we introduce a trace-based simulation approach to obtain detailed MPI message matching performance data for MPI applications without perturbing their execution. Using a number of key parallel workloads, we demonstrate that this simulator approach can rapidly and accurately characterize matching behavior. Specifically, we use our simulator to collect several important statistics about the operation of the MPI posted and unexpected queues. For example, we present data about search lengths and the duration that messages spend in the queues waiting to be matched. Data gathered using this simulation-based approach have significant potential to aid hardware designers in determining resource allocation for MPI matching functions and provide application and middleware developers with insight into the scalability issues associated with MPI message matching.
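
To make the idea of replaying a trace through the posted and unexpected queues concrete, here is a minimal sketch in Python. It is not the authors' simulator. The `Event` record, the `simulate` function, the `ANY` wildcard constant, and the statistic names are all hypothetical, and the sketch assumes simple linear-list queues and ignores communicators. It only illustrates how search lengths and queue residence times could be gathered offline from a time-ordered trace.

```python
from collections import namedtuple

# Hypothetical wildcard standing in for MPI_ANY_SOURCE / MPI_ANY_TAG.
ANY = -1

# Hypothetical trace record: kind is "recv_post" or "msg_arrive".
Event = namedtuple("Event", "time kind source tag")


def matches(recv, msg):
    """A posted receive matches a message on source and tag, honoring wildcards."""
    return recv.source in (ANY, msg.source) and recv.tag in (ANY, msg.tag)


def _search(queue, pred):
    """Linear search from the head; return (index or None, entries inspected)."""
    for i, entry in enumerate(queue):
        if pred(entry):
            return i, i + 1
    return None, len(queue)


def simulate(events):
    """Replay trace events and record search lengths and queue wait times."""
    posted, unexpected = [], []
    stats = {
        "posted_search": [],      # entries inspected per arriving message
        "unexpected_search": [],  # entries inspected per posted receive
        "posted_wait": [],        # time a receive sat in the posted queue
        "unexpected_wait": [],    # time a message sat in the unexpected queue
    }
    for ev in sorted(events, key=lambda e: e.time):
        if ev.kind == "msg_arrive":
            idx, length = _search(posted, lambda r: matches(r, ev))
            stats["posted_search"].append(length)
            if idx is None:
                unexpected.append(ev)  # no receive posted yet
            else:
                recv = posted.pop(idx)
                stats["posted_wait"].append(ev.time - recv.time)
        elif ev.kind == "recv_post":
            idx, length = _search(unexpected, lambda m: matches(ev, m))
            stats["unexpected_search"].append(length)
            if idx is None:
                posted.append(ev)  # message has not arrived yet
            else:
                msg = unexpected.pop(idx)
                stats["unexpected_wait"].append(ev.time - msg.time)
        else:
            raise ValueError(f"unknown event kind: {ev.kind}")
    return stats
```

Because the replay runs offline on recorded events, gathering these statistics does not perturb the application, which is the property the abstract emphasizes. Swapping the linear lists for another match-list structure would change the search-length numbers, which is one way such a sketch could be used to compare designs.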
