A Visual Guide to MPI All-to-all

2022 IEEE 29th International Conference on High Performance Computing, Data and Analytics Workshop (HiPCW). Pub Date: 2022-12-01. DOI: 10.1109/HiPCW57629.2022.00008

Nick Netterville, Ke Fan, Sidharth Kumar, Thomas Gilray


Citations: 3

Abstract

The standard implementation of MPI_Alltoall in MPI libraries (e.g., MPICH, Open-MPI) uses a combination of techniques, such as the spread-out and Bruck algorithms. The spread-out algorithm uses a number of iterations that is linear in the process count $P$, while the Bruck algorithm uses a logarithmic number. The Bruck algorithm transfers more data overall, but with fewer communication steps, and is thus better suited to smaller (latency-dominated) messages. MPI implementations dynamically choose the underlying algorithm depending on process count and message size. We have created an easy-to-use, parameterized, interactive web-based visualization that shows the implementation details of both the linear-step spread-out algorithm and the log-step Bruck algorithm, along with the decision tree used to choose between them. Our tool visually illustrates and animates the two algorithms, pointing out key differences such as the number of iterations, the communication pattern, and whether they operate in place.
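The step-count versus data-volume trade-off described in the abstract can be illustrated with a small single-process simulation. The sketch below is not taken from the paper or from any MPI library: it assumes the textbook formulation of the two algorithms (pairwise exchange at distance $i$ for spread-out; rotate, exchange by bit mask, rotate back for Bruck), represents each block as a `(source, destination)` tuple, and uses a simple latency-bandwidth cost model. The function names and the `alpha` and `beta` parameters are hypothetical, chosen only for illustration.

```python
def make_blocks(P):
    """send[r][j] is the block that rank r wants to deliver to rank j."""
    return [[(r, j) for j in range(P)] for r in range(P)]


def spread_out_alltoall(send):
    """Linear-step exchange: in step i, rank r sends one block to rank (r + i) % P.

    Returns (recv, steps, blocks_sent_per_rank).
    """
    P = len(send)
    recv = [[None] * P for _ in range(P)]
    for r in range(P):
        recv[r][r] = send[r][r]  # a rank's own block is a local copy
    steps = 0
    for i in range(1, P):
        for r in range(P):
            dst = (r + i) % P
            recv[dst][r] = send[r][dst]
        steps += 1
    return recv, steps, steps  # one block per rank per step


def bruck_alltoall(send):
    """Log-step exchange, assuming the textbook three-phase Bruck scheme.

    Returns (recv, steps, blocks_sent_per_rank).
    """
    P = len(send)
    # Phase 1: local rotation, so slot j on rank r holds the block for rank (r + j) % P.
    tmp = [[send[r][(j + r) % P] for j in range(P)] for r in range(P)]
    steps = 0
    blocks_sent = 0
    k = 1
    while k < P:
        # Phase 2: every slot whose index has bit k set moves to rank (r + k) % P.
        moving = [j for j in range(P) if j & k]
        nxt = [row[:] for row in tmp]
        for r in range(P):
            dst = (r + k) % P
            for j in moving:
                nxt[dst][j] = tmp[r][j]
        tmp = nxt
        steps += 1
        blocks_sent += len(moving)
        k <<= 1
    # Phase 3: inverse rotation, so recv[r][j] is the block that came from rank j.
    recv = [[tmp[r][(r - j) % P] for j in range(P)] for r in range(P)]
    return recv, steps, blocks_sent


def model_time(steps, blocks, block_bytes, alpha, beta):
    """Hypothetical cost model: alpha seconds per step plus beta seconds per byte."""
    return steps * alpha + blocks * block_bytes * beta
```

For `P = 8`, `spread_out_alltoall(make_blocks(8))` takes 7 steps and sends 7 blocks per rank, while `bruck_alltoall(make_blocks(8))` takes 3 steps but sends 12 blocks per rank. Feeding these counts into `model_time` with illustrative values such as `alpha=1e-6` and `beta=1e-9` favors Bruck for 8-byte blocks and spread-out for 1 MB blocks, which matches the latency-dominated versus bandwidth-dominated distinction the abstract draws. The actual thresholds used by MPICH or Open-MPI are not modeled here.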
