Efficient task placement and routing of nearest neighbor exchanges in dragonfly networks

B. Prisacari, G. Rodríguez, P. Heidelberger, Dong Chen, C. Minkenberg, T. Hoefler
Published in: IEEE International Symposium on High-Performance Parallel Distributed Computing, 2014-06-23
DOI: 10.1145/2600212.2600225 (https://doi.org/10.1145/2600212.2600225)
Citations: 44

Abstract

Dragonfly networks are a recent design and one of the most promising topologies for the Exascale effort because of their scalability and cost. Although they can achieve very high throughput under uniform random all-to-all traffic, they can suffer significant performance degradation for other common high-performance computing workloads, such as stencil (multi-dimensional nearest neighbor) patterns. Often, the shortfall from peak performance stems from an insufficient understanding of the interaction between the workload and the network, and of how application-specific task-to-node mapping strategies can serve as optimization vehicles. To address these issues, we propose a theoretical performance analysis framework that takes as inputs a network specification and a traffic demand matrix characterizing an arbitrary workload, and predicts where bottlenecks will occur in the network and how they will affect the effective sustainable injection bandwidth. We then focus our analysis on a specific high-interest communication pattern, the multi-dimensional Cartesian nearest neighbor exchange, and provide analytic bounds (owing to bottlenecks in the remote links of the Dragonfly) on its expected performance across a multitude of possible mapping strategies. Finally, using a comprehensive set of simulation results, we validate the correctness of the theoretical approach and, in the process, address some misconceptions regarding Dragonfly network behavior and evaluation (such as the choice of throughput maximization over workload completion time minimization as the optimization objective), as well as the question of whether the standard notion of Dragonfly balance can be extended to workloads other than uniform random traffic.
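To make the framework's inputs and outputs concrete, the following is a minimal sketch of a demand-matrix bottleneck calculation. It is not the authors' framework. It assumes a fixed single path per source-destination pair and that all flows are throttled by the same factor. The function name `bottleneck_analysis`, the link labels, and the toy demands are all hypothetical.

```python
from collections import defaultdict


def bottleneck_analysis(demands, routes, capacity):
    """Find the most loaded link and the sustainable injection fraction.

    demands:  {(src, dst): requested rate}            (hypothetical input format)
    routes:   {(src, dst): [link, ...]}, one fixed path per pair (assumption)
    capacity: {link: bandwidth}
    Returns (scale, bottleneck_link, loads).
    """
    loads = defaultdict(float)
    for pair, rate in demands.items():
        for link in routes[pair]:
            loads[link] += rate
    if not loads:
        return 1.0, None, {}
    # Utilization each link would see if every flow injected at its full demand.
    utilization = {link: loads[link] / capacity[link] for link in loads}
    worst = max(utilization, key=utilization.get)
    # Uniformly throttle all flows until the worst link is no longer oversubscribed.
    scale = min(1.0, 1.0 / utilization[worst])
    return scale, worst, dict(loads)


# Toy example: two groups joined by a single remote link "g0-g1".
demands = {("a", "c"): 1.0, ("b", "d"): 1.0, ("a", "b"): 1.0}
routes = {
    ("a", "c"): ["g0-g1"],
    ("b", "d"): ["g0-g1"],
    ("a", "b"): ["local-ab"],
}
capacity = {"g0-g1": 1.0, "local-ab": 1.0}

scale, worst, loads = bottleneck_analysis(demands, routes, capacity)
print(scale, worst)
```

In this toy case two flows share the one remote link, so that link carries twice its capacity and each task can sustain only half of its requested injection rate. A different task-to-node mapping would change `routes`, and with it the bottleneck.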