Implementation and performance evaluation of MPI persistent collectives in MPC: a case study

Proceedings of the 27th European MPI Users' Group Meeting. Pub Date: 2020-09-21. DOI: 10.1145/3416315.3416321

Stéphane Bouhrour, Julien Jaeger


Citations: 1

Abstract



Persistent collective communications have recently been voted into the MPI standard, opening the door to many optimizations that reduce the cost of collectives, in particular for recurring operations. Persistent semantics include an initialization phase that is called only once for a given collective. This phase can absorb the setup costs the collective requires, so that they are not paid each time the operation is performed. We give an overview of the implementation of persistent collectives in the MPC MPI runtime. We first present a naïve implementation suitable for MPI runtimes that already provide nonblocking collectives. We then improve this first implementation with two levels of caching optimizations. We present performance results for the naïve and optimized versions and discuss their impact on different collective algorithms. On a repetitive benchmark, we observe a performance improvement over the naïve version of up to a 3x speedup for the reduce collective.
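The idea of paying setup costs once can be illustrated with a toy model. The sketch below is not MPC code and does not use MPI. It simulates a reduce over a list of per-rank values, assuming a binomial-tree schedule as the "building cost". All names (`build_binomial_reduce_schedule`, `run_schedule`, `naive_reduce`, `PersistentReduce`) are hypothetical and chosen for illustration only. In real MPI 4.0 code the corresponding calls would be `MPI_Reduce_init` once, followed by repeated `MPI_Start` and `MPI_Wait`.

```python
"""Toy model of persistent-collective semantics: build once, replay many times."""
from typing import Callable, List, Sequence, Tuple

Step = Tuple[int, int]  # (receiver rank, sender rank)


def build_binomial_reduce_schedule(nranks: int) -> List[List[Step]]:
    """Return rounds of (receiver, sender) pairs for a tree reduce to rank 0."""
    rounds: List[List[Step]] = []
    dist = 1
    while dist < nranks:
        # In each round, rank r receives from rank r + dist if that rank exists.
        rounds.append(
            [(r, r + dist) for r in range(0, nranks, 2 * dist) if r + dist < nranks]
        )
        dist *= 2
    return rounds


def run_schedule(
    schedule: List[List[Step]],
    values: Sequence[int],
    op: Callable[[int, int], int],
) -> int:
    """Apply the schedule to a copy of the per-rank values; rank 0 holds the result."""
    buf = list(values)
    for rnd in schedule:
        for dst, src in rnd:
            buf[dst] = op(buf[dst], buf[src])
    return buf[0]


def naive_reduce(values: Sequence[int], op: Callable[[int, int], int]) -> int:
    """Assumed naive behaviour: the schedule is rebuilt on every call."""
    schedule = build_binomial_reduce_schedule(len(values))
    return run_schedule(schedule, values, op)


class PersistentReduce:
    """Hypothetical stand-in for a persistent request: init once, start many times."""

    def __init__(self, nranks: int, op: Callable[[int, int], int]) -> None:
        self.op = op
        # Setup cost is paid here, once, and cached for later starts.
        self.schedule = build_binomial_reduce_schedule(nranks)

    def start(self, values: Sequence[int]) -> int:
        """Replay the cached schedule on fresh data."""
        return run_schedule(self.schedule, values, self.op)


if __name__ == "__main__":
    add = lambda a, b: a + b
    request = PersistentReduce(5, add)
    for step in range(3):
        data = [step + rank for rank in range(5)]
        # Both paths must agree; only the persistent one reuses its schedule.
        assert request.start(data) == naive_reduce(data, add)
    print("persistent and naive reduce agree")
```

The persistent object does the same arithmetic as the naive function. The only difference is where the schedule construction happens, which is the kind of recurring cost the initialization phase is meant to remove.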
