Feroz Zahid, Ernst Gunnar Gran, Bartosz Bogdanski, Bjørn Dag Johnsen, T. Skeie
Published in: 2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing
Publication date: 2015-03-04
DOI: 10.1109/PDP.2015.111
Citations: 10
A Weighted Fat-Tree Routing Algorithm for Efficient Load-Balancing in InfiniBand Enterprise Clusters
InfiniBand (IB) has become a popular network interconnect for high-performance computing (HPC) systems. Many large IB-based HPC systems use some variant of the fat-tree topology to take advantage of the useful properties fat-trees offer. The fat-tree routing algorithm is one of the most efficient deterministic routing algorithms for fat-tree topologies. The algorithm ensures that the number of routes assigned to each link is balanced across the fabric. However, one problem with its load-balancing technique is that it assumes a uniform traffic distribution in the network. When routes towards nodes that mainly consume large amounts of data are assigned to share links in the fabric while alternative links are underutilized, sub-optimal network throughput is obtained. Also, because the fat-tree algorithm routes nodes according to their indexing order, performance may differ between two systems cabled in exactly the same way. In this paper, we propose wFatTree, a novel fat-tree routing algorithm that considers node traffic characteristics to balance load across the network links more evenly and with predictable network performance. Our experiments and simulations show an improvement of up to 60% in total network throughput on large fat-tree installations when using wFatTree routing. Furthermore, wFatTree can also be used to prioritize traffic flowing towards the critical nodes in the network.
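The difference between balancing route counts and balancing expected traffic can be illustrated with a toy sketch. This is not the wFatTree algorithm from the paper; it only assumes a single switch with a few parallel uplinks and a per-node receive weight. The function names (`assign_routes`, `traffic_per_link`), the node labels, and the weight values are all hypothetical.

```python
from typing import Dict, List, Tuple


def assign_routes(
    weights: Dict[str, int], num_links: int, weighted: bool
) -> Tuple[Dict[str, int], List[int]]:
    """Assign the route towards each destination node to one of num_links uplinks.

    weighted=False: every route counts as 1 and nodes are visited in index
    (insertion) order, so only the number of routes per link is balanced.
    weighted=True: every route counts as the node's weight, and heavier nodes
    are visited first, so the result does not depend on the indexing order.
    """
    load = [0] * num_links          # accumulated cost per uplink
    assignment: Dict[str, int] = {}

    if weighted:
        # Sort by descending weight, then by name, for a deterministic order.
        order = sorted(weights, key=lambda n: (-weights[n], n))
    else:
        order = list(weights)

    for node in order:
        cost = weights[node] if weighted else 1
        # Pick the least-loaded uplink; ties resolve to the lowest link index.
        link = min(range(num_links), key=lambda i: load[i])
        assignment[node] = link
        load[link] += cost
    return assignment, load


def traffic_per_link(
    assignment: Dict[str, int], weights: Dict[str, int], num_links: int
) -> List[int]:
    """Sum the (assumed) traffic weight of all routes placed on each uplink."""
    traffic = [0] * num_links
    for node, link in assignment.items():
        traffic[link] += weights[node]
    return traffic


# Hypothetical receive weights: A and C are heavy consumers, B and D are light.
weights = {"A": 100, "B": 1, "C": 100, "D": 1}

count_based, _ = assign_routes(weights, num_links=2, weighted=False)
weight_based, _ = assign_routes(weights, num_links=2, weighted=True)

print(traffic_per_link(count_based, weights, 2))   # [200, 2]
print(traffic_per_link(weight_based, weights, 2))  # [101, 101]
```

In this toy case both strategies place two routes on each uplink, but the count-based assignment puts both heavy consumers on the same link, which mirrors the situation the abstract describes.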