Improving the performance of Dragonfly networks through restrictive Proxy routing strategies

IF 4.6, Zone 2 (Computer Science), JCR Q1: COMPUTER SCIENCE, HARDWARE & ARCHITECTURE
Javier Navaridas, Jose A. Pascual
Computer Networks, vol. 267, Article 111334. Published 2025-05-20. DOI: 10.1016/j.comnet.2025.111334
https://www.sciencedirect.com/science/article/pii/S1389128625003019
Citations: 0

Abstract

Dragonfly has become the network of choice for large-scale high-performance computing systems and, indeed, it dominates the top positions of supercomputer rankings. The reason is that it offers a sweet spot in terms of cost, simplicity, performance, fault tolerance and power consumption. In this work, we propose a collection of routing strategies that restrict proxies to be adjacent to either the local or the remote router. This way, they feature shorter paths than standard Valiant routing. We carry out an extensive simulation-based evaluation to assess their performance. With synthetic traffic from independent sources at different scales, our experiments found latency reductions of up to 76% and throughput improvements of up to 26% compared with standard Valiant routing. Furthermore, with realistic application-inspired workloads, the strategies required between 5% and 20% less time to perform communications. In general, we observe that selecting proxies adjacent to the sender is more beneficial than selecting proxies adjacent to the destination, because the latter tends to generate backpressure in the last level of the interconnect. Interestingly, we found that the most restrictive proxy routing strategies obtain the best results in all scenarios and show that, counterintuitively, the lower the path diversity, the more balanced the use of network resources. Our study also investigates the interplay between routing and Dragonfly parameters and provides optimal parameters for proxy-based routing algorithms. Finally, we discuss some practical considerations related to the deployment of our strategies.
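To make the path-length argument concrete, the following is a minimal sketch and not the authors' simulator. It assumes a canonical Dragonfly (`a` routers per group in a full mesh, `h` global links per router, `a*h + 1` groups), a simple global-link arrangement chosen only for illustration, and hop counts taken as shortest paths in the router graph. The function names and the three policy labels are hypothetical. "Valiant" here means any router other than source and destination may act as proxy, which simplifies the standard scheme.

```python
from collections import deque
from itertools import combinations


def build_dragonfly(a, h):
    """Adjacency sets for a canonical Dragonfly: a routers per group,
    h global links per router, g = a*h + 1 groups (assumed layout)."""
    g = a * h + 1
    adj = {(grp, r): set() for grp in range(g) for r in range(a)}
    for grp in range(g):
        # Local links: routers within a group form a full mesh.
        for r1, r2 in combinations(range(a), 2):
            adj[(grp, r1)].add((grp, r2))
            adj[(grp, r2)].add((grp, r1))
        # Global links: port k of a group reaches group (grp + k + 1) mod g,
        # so every pair of groups shares exactly one link.
        for k in range(a * h):
            dst = (grp + k + 1) % g
            k_back = a * h - 1 - k  # port on dst that points back to grp
            adj[(grp, k // h)].add((dst, k_back // h))
    return adj


def hop_counts(adj, src):
    """Breadth-first search: shortest hop count from src to every router."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def average_path_lengths(a, h):
    """Mean hops over all (source, destination, proxy) choices per policy."""
    adj = build_dragonfly(a, h)
    dist = {s: hop_counts(adj, s) for s in adj}
    totals = {
        "valiant": [0, 0],
        "source_adjacent": [0, 0],
        "destination_adjacent": [0, 0],
    }
    for s in adj:
        for d in adj:
            if s == d:
                continue
            candidates = {
                # Simplified Valiant: any other router may be the proxy.
                "valiant": [p for p in adj if p not in (s, d)],
                # Restricted: proxy is a direct neighbour of the source.
                "source_adjacent": [p for p in adj[s] if p != d],
                # Restricted: proxy is a direct neighbour of the destination.
                "destination_adjacent": [p for p in adj[d] if p != s],
            }
            for policy, proxies in candidates.items():
                for p in proxies:
                    totals[policy][0] += dist[s][p] + dist[p][d]
                    totals[policy][1] += 1
    return {policy: hops / count for policy, (hops, count) in totals.items()}


if __name__ == "__main__":
    for policy, mean_hops in average_path_lengths(4, 2).items():
        print(f"{policy}: {mean_hops:.2f} hops on average")
```

In this model both restricted policies yield the same mean hop count, because the graph is undirected. The advantage the abstract reports for sender-adjacent proxies comes from congestion (backpressure near the destination), which a pure hop-count sketch like this one does not capture.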
Source journal: Computer Networks (Engineering and Technology, Telecommunications)
CiteScore: 10.80
Self-citation rate: 3.60%
Publication volume: 434
Review time: 8.6 months
Journal description: Computer Networks is an international, archival journal providing a publication vehicle for complete coverage of all topics of interest to those involved in the computer communications networking area. The audience includes researchers, managers and operators of networks as well as designers and implementors. The Editorial Board will consider any material for publication that is of interest to those groups.