MPI Detach - Asynchronous Local Completion

Proceedings of the 27th European MPI Users' Group Meeting. Pub Date: 2020-09-21. DOI: 10.1145/3416315.3416323

Joachim Protze, Marc-André Hermanns, A. C. Demiralp, Matthias S. Müller, T. Kuhlen


Citations: 7

Abstract

When aiming for large-scale parallel computing, waiting times due to network latency, synchronization, and load imbalance are the primary opponents of high parallel efficiency. A common approach to hiding latency behind computation is the use of non-blocking communication. In the presence of a consistent load imbalance, synchronization cost is just the visible symptom of that imbalance. Tasking approaches such as those in OpenMP, TBB, OmpSs, or C++20 coroutines promise to expose a higher degree of concurrency, which can be distributed across the available execution units and significantly improve load balance. The available MPI non-blocking functionality does not integrate seamlessly into such tasking parallelization. In this work, we present a slim extension of the MPI interface that allows seamless integration of non-blocking communication with the available concepts of asynchronous execution in OpenMP and C++.
