Partition-Aware Routing to Improve Network Isolation in Infiniband Based Multi-tenant Clusters

Feroz Zahid, Ernst Gunnar Gran, Bartosz Bogdanski, Bjørn Dag Johnsen, T. Skeie
{"title":"Partition-Aware Routing to Improve Network Isolation in Infiniband Based Multi-tenant Clusters","authors":"Feroz Zahid, Ernst Gunnar Gran, Bartosz Bogdanski, Bjørn Dag Johnsen, T. Skeie","doi":"10.1109/CCGrid.2015.96","DOIUrl":null,"url":null,"abstract":"InfiniBand (IB) is a widely used network interconnect for modern high-performance computing systems. In large IB fabrics, isolation of nodes is provided through partitioning. The routing algorithm, however, is unaware of these partitions in the network, Traffic flows belonging to different partitions might share links inside the network fabric. This sharing of intermediate links creates interference, which is particularly critical to avoid in multi-tenant environments like a cloud. In such systems, each tenant should experience predictable network performance, unaffected by the workload of other tenants. In addition, using current routing schemes, routes crossing partition boundaries are considered when distributing routes onto links in the network, despite the fact that these routes will never be used. The result is degraded load-balancing. In this paper, we present a novel partition-aware fat-tree routing algorithm, pFTree. The pFTree algorithm utilizes several mechanisms to provide network-wide isolation of partitions belonging to different tenant groups. Given the available network resources, pFTree starts by isolating partitions at the physical link level, and then moves on to utilize virtual lanes, if needed. Our experiments and simulations show that pFTree is able to significantly reduce the affect of inter-partition interference without any additional functional overhead. Furthermore, pFTree also provides improved load-balancing over the de facto standard IB fat-tree routing algorithm.","PeriodicalId":6664,"journal":{"name":"2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"28 1","pages":"189-198"},"PeriodicalIF":0.0000,"publicationDate":"2015-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CCGrid.2015.96","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 10

Abstract

InfiniBand (IB) is a widely used network interconnect for modern high-performance computing systems. In large IB fabrics, isolation of nodes is provided through partitioning. The routing algorithm, however, is unaware of these partitions in the network, Traffic flows belonging to different partitions might share links inside the network fabric. This sharing of intermediate links creates interference, which is particularly critical to avoid in multi-tenant environments like a cloud. In such systems, each tenant should experience predictable network performance, unaffected by the workload of other tenants. In addition, using current routing schemes, routes crossing partition boundaries are considered when distributing routes onto links in the network, despite the fact that these routes will never be used. The result is degraded load-balancing. In this paper, we present a novel partition-aware fat-tree routing algorithm, pFTree. The pFTree algorithm utilizes several mechanisms to provide network-wide isolation of partitions belonging to different tenant groups. Given the available network resources, pFTree starts by isolating partitions at the physical link level, and then moves on to utilize virtual lanes, if needed. Our experiments and simulations show that pFTree is able to significantly reduce the affect of inter-partition interference without any additional functional overhead. Furthermore, pFTree also provides improved load-balancing over the de facto standard IB fat-tree routing algorithm.
分区感知路由改善Infiniband多租户集群中的网络隔离
IB (InfiniBand)是一种广泛应用于现代高性能计算系统的网络互连技术。在大型IB结构中,通过分区提供节点隔离。然而,路由算法不知道网络中的这些分区,属于不同分区的流量可能在网络结构中共享链接。这种中间链接的共享会产生干扰,在像云这样的多租户环境中避免这种干扰尤为重要。在这样的系统中,每个租户都应该体验到可预测的网络性能,不受其他租户工作负载的影响。此外,使用当前的路由方案,在将路由分配到网络中的链路时,会考虑跨越分区边界的路由,尽管这些路由永远不会被使用。其结果是负载均衡降级。本文提出了一种新的分区感知胖树路由算法pFTree。pFTree算法利用几种机制为属于不同租户组的分区提供网络范围的隔离。给定可用的网络资源,pFTree首先在物理链路级别隔离分区,然后在需要时继续利用虚拟通道。我们的实验和仿真表明,pFTree能够显著降低分区间干扰的影响,而没有任何额外的功能开销。此外,pFTree还在事实上的标准IB胖树路由算法上提供了改进的负载均衡。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信