Micro-Benchmarking MPI Partitioned Point-to-Point Communication
Yiltan Hassan Temuçin, Ryan E. Grant, A. Afsahi
Proceedings of the 51st International Conference on Parallel Processing, 2022-08-29
DOI: https://doi.org/10.1145/3545008.3545088
Citations: 4
Abstract
Modern High-Performance Computing (HPC) architectures have created a need for scalable hybrid programming models. The Message Passing Interface (MPI) 4.0 standard introduced a new communication model: MPI Partitioned Point-to-Point communication. This model lets multiple threads contribute data with lower overheads than traditional MPI point-to-point communication. In this paper, we design the first publicly available micro-benchmark suite for MPI Partitioned. It measures metrics that give insight into the benefits of the new model and into the scenarios where traditional MPI point-to-point is better suited. We provide suggestions to application developers on how to choose a partition size based on their application's compute and message sizes. We evaluate MPI Partitioned communication with both hot and cold CPU caches, under system noise drawn from different probability distributions, in direct point-to-point communication, and in commonly used MPI communication patterns such as a halo exchange and Sweep3D.
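To make the trade-off in the abstract concrete, the following is a minimal toy model in Python. It is not the paper's benchmark suite and it makes no MPI calls. It assumes a single serialized link, a fixed per-transfer latency, and equally sized partitions; the function names (`single_send_completion`, `partitioned_completion`, `noisy_ready_times`) and the exponential noise distribution are hypothetical choices made for illustration only.

```python
import random


def single_send_completion(ready_times, total_bytes, bandwidth, latency):
    """Toy model: one message is sent only after every thread has finished."""
    return max(ready_times) + latency + total_bytes / bandwidth


def partitioned_completion(ready_times, total_bytes, bandwidth, latency):
    """Toy model: each partition is sent once it is ready and the link is free.

    Assumes equal partition sizes and a per-partition latency cost.
    """
    part_bytes = total_bytes / len(ready_times)
    link_free = 0.0
    for ready in sorted(ready_times):
        start = max(ready, link_free)  # wait for both the data and the link
        link_free = start + latency + part_bytes / bandwidth
    return link_free


def noisy_ready_times(num_threads, compute_time, noise_scale, rng):
    """Per-thread finish times: fixed compute plus exponential noise (an assumption)."""
    return [
        compute_time + rng.expovariate(1.0 / noise_scale)
        for _ in range(num_threads)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    ready = noisy_ready_times(num_threads=8, compute_time=1.0, noise_scale=0.5, rng=rng)
    args = dict(total_bytes=8.0, bandwidth=4.0, latency=0.05)
    print("single send:", single_send_completion(ready, **args))
    print("partitioned:", partitioned_completion(ready, **args))
```

In this model, threads that finish at times 1, 2, 3, and 4 with a 4-byte message, unit bandwidth, and zero latency complete at time 8 with a single send and at time 5 with partitions, because early partitions overlap with the remaining compute. When all threads finish together and each transfer pays a latency of 0.5, the single send completes at 5.5 and the partitioned version at 7.0, which illustrates why partition size has to be weighed against compute and message size.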