A Visual Guide to MPI All-to-all

Nick Netterville, Ke Fan, Sidharth Kumar, Thomas Gilray
Published in: 2022 IEEE 29th International Conference on High Performance Computing, Data and Analytics Workshop (HiPCW), 2022-12-01
DOI: 10.1109/HiPCW57629.2022.00008 (https://doi.org/10.1109/HiPCW57629.2022.00008)
Citations: 3

Abstract

The standard implementation of MPI_Alltoall in MPI libraries (e.g., MPICH, Open-MPI) uses a combination of techniques, such as the spread-out and Bruck algorithms. The spread-out algorithm uses a number of iterations that is linear in the process count $P$, while the Bruck algorithm uses a logarithmic number. The Bruck algorithm transfers more data overall, but in fewer communication steps, and is thus better suited to small (latency-dominated) messages. MPI implementations dynamically choose the underlying algorithm depending on process count and message size. We have created an easy-to-use, parameterized, interactive web-based visualization that shows the implementation details of both the linear-step spread-out algorithm and the log-step Bruck algorithm, along with the decision tree used to choose between them. Our tool illustrates and animates the two algorithms, pointing out key differences such as the number of iterations, the communication pattern, and whether they operate in-place.
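
The trade-off between the two algorithms can be made concrete with a small simulation. The sketch below is not taken from the paper or from any MPI library: it models $P$ processes as rows of a Python list, uses the textbook three-phase form of Bruck (local rotation, log-step exchange, inverse rotation), and treats spread-out as $P-1$ staggered rounds. The function names `spread_out` and `bruck` and the step and block counters are illustrative assumptions.

```python
def spread_out(send):
    """Simulate linear-step all-to-all: in round k, process p sends to (p + k) % P."""
    P = len(send)
    recv = [[None] * P for _ in range(P)]
    for p in range(P):
        recv[p][p] = send[p][p]          # a process keeps its own block locally
    steps = blocks_per_proc = 0
    for k in range(1, P):                # P - 1 communication rounds
        for p in range(P):
            dst = (p + k) % P
            recv[dst][p] = send[p][dst]  # one block travels directly to its destination
        steps += 1
        blocks_per_proc += 1
    return recv, steps, blocks_per_proc


def bruck(send):
    """Simulate log-step Bruck all-to-all (textbook three-phase form)."""
    P = len(send)
    # Phase 1: local rotation, so index i holds the block that must travel distance i.
    tmp = [[send[p][(p + i) % P] for i in range(P)] for p in range(P)]
    steps = blocks_per_proc = 0
    k = 1
    while k < P:                         # ceil(log2 P) communication rounds
        moving = [i for i in range(P) if i & k]   # indices with this bit set
        nxt = [row[:] for row in tmp]
        for p in range(P):
            dst = (p + k) % P
            for i in moving:
                nxt[dst][i] = tmp[p][i]  # block hops k ranks and keeps its index
        tmp = nxt
        steps += 1
        blocks_per_proc += len(moving)
        k <<= 1
    # Phase 3: inverse rotation, placing the block from source s at index s.
    recv = [[tmp[p][(p - s) % P] for s in range(P)] for p in range(P)]
    return recv, steps, blocks_per_proc


if __name__ == "__main__":
    P = 8
    send = [[(src, dst) for dst in range(P)] for src in range(P)]
    for name, algo in (("spread-out", spread_out), ("Bruck", bruck)):
        recv, steps, blocks = algo(send)
        print(f"{name}: {steps} steps, {blocks} blocks sent per process")
```

In this model with $P = 8$, spread-out takes 7 steps and sends 7 blocks per process, while Bruck takes 3 steps and sends 12 blocks per process. This matches the abstract's point that Bruck trades extra data volume for fewer communication steps.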