Improving the performance of Dragonfly networks through restrictive Proxy routing strategies
Authors: Javier Navaridas, Jose A. Pascual
Journal: Computer Networks, volume 267, Article 111334
Publication date: 2025-05-20
DOI: 10.1016/j.comnet.2025.111334
URL: https://www.sciencedirect.com/science/article/pii/S1389128625003019
Impact factor: 4.6 (JCR Q1, Computer Science, Hardware & Architecture)
Citations: 0
Abstract
Dragonfly has become the network of choice for large-scale high-performance computing systems and dominates the top positions of supercomputer rankings. The reason is that it offers a sweet spot in terms of cost, simplicity, performance, fault tolerance and power consumption. In this work, we propose a collection of routing strategies that restrict proxies to be adjacent to either the local or the remote router. As a result, they feature shorter paths than standard Valiant routing. We carry out an extensive simulation-based evaluation to assess their performance. With synthetic traffic from independent sources at different scales, our experiments found latency reductions of up to 76% and throughput improvements of up to 26% compared with standard Valiant routing. Furthermore, with realistic application-inspired workloads, the strategies required between 5% and 20% less time to perform communications. In general, we observe that selecting proxies adjacent to the sender is more beneficial than selecting proxies adjacent to the destination, because the latter tends to generate backpressure in the last level of the interconnect. Interestingly, we found that the most restrictive proxy routing strategies obtain the best results in all scenarios and show that, counterintuitively, the lower the path diversity, the more balanced the use of network resources. Our study also investigates the interplay between routing and Dragonfly parameters and provides optimal parameters for proxy-based routing algorithms. Finally, we discuss some practical considerations related to the deployment of our strategies.
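The claim that adjacent proxies shorten paths can be illustrated with a small hop-count sketch. It is not the authors' simulator or their exact strategies: the topology below is a toy Dragonfly under assumed parameters (`a` routers per group, `h` global links per router, `a*h + 1` groups, an assumed consecutive global-link arrangement), shortest-path hop counts stand in for minimal routing, and the names `build_dragonfly`, `hops_from` and `proxy_path_stats` are hypothetical. It compares a Valiant-style choice of any router as proxy with a proxy restricted to the source router's direct neighbours.

```python
from collections import deque


def build_dragonfly(a, h):
    """Toy Dragonfly (assumed layout): a routers per group, h global links
    per router, g = a*h + 1 groups, exactly one global link per group pair."""
    g = a * h + 1
    adj = {(grp, r): set() for grp in range(g) for r in range(a)}
    for grp in range(g):
        # Local links: routers inside a group are fully connected.
        for r1 in range(a):
            for r2 in range(a):
                if r1 != r2:
                    adj[(grp, r1)].add((grp, r2))
        # Global links: offset k reaches group grp+k+1; router k // h owns it.
        for k in range(g - 1):
            peer = (grp + k + 1) % g
            u = (grp, k // h)
            v = (peer, (g - 2 - k) // h)  # router owning the reverse offset
            adj[u].add(v)
            adj[v].add(u)
    return adj


def hops_from(adj, src):
    """Breadth-first search: shortest hop count from src to every router."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def proxy_path_stats(adj, restrict_to_source_neighbours):
    """Return (average, maximum) hops of src -> proxy -> dst over all
    inter-group router pairs and all allowed proxies."""
    dist = {u: hops_from(adj, u) for u in adj}
    lengths = []
    for src in adj:
        for dst in adj:
            if src[0] == dst[0]:
                continue  # only consider traffic between different groups
            candidates = adj[src] if restrict_to_source_neighbours else adj
            for proxy in candidates:
                if proxy == src or proxy == dst:
                    continue
                lengths.append(dist[src][proxy] + dist[proxy][dst])
    return sum(lengths) / len(lengths), max(lengths)


if __name__ == "__main__":
    topology = build_dragonfly(a=4, h=2)
    print("any proxy (Valiant-style):", proxy_path_stats(topology, False))
    print("source-adjacent proxy:    ", proxy_path_stats(topology, True))
```

In this toy model a source-adjacent proxy is one hop away and the remaining leg is bounded by the network diameter of three hops, so no path exceeds four hops, whereas an unrestricted proxy can cost up to six. The sketch only counts hops; it says nothing about the latency, throughput or backpressure effects reported above, which depend on the authors' simulation setup.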
About the journal:
Computer Networks is an international, archival journal providing a publication vehicle for complete coverage of all topics of interest to those involved in the computer communications networking area. The audience includes researchers, managers and operators of networks as well as designers and implementors. The Editorial Board will consider any material for publication that is of interest to those groups.