Multipath Load Balancing for M × N Communication Patterns on the Blue Gene/Q Supercomputer Interconnection Network

Huy Bui, R. Jacob, Preeti Malakar, V. Vishwanath, Andrew E. Johnson, M. Papka, J. Leigh
Published in: 2015 IEEE International Conference on Cluster Computing
Publication date: 2015-09-08
DOI: 10.1109/CLUSTER.2015.140 (https://doi.org/10.1109/CLUSTER.2015.140)
Citation count: 3

Abstract

The achievable networking performance of applications on a supercomputer depends on how the applications' communication patterns interact with the routing algorithms the supercomputer uses. To achieve the highest networking performance, the routing algorithms need to be designed optimally for those communication patterns. However, communication patterns vary widely from application to application, and even from phase to phase within one application, whereas routing algorithms vary little and are usually optimized for typical communication patterns. This results in high networking performance for favored communication patterns but low networking performance for others. In this paper we present approaches for improving networking performance by rebalancing load on physical links on the Blue Gene/Q supercomputer. We realize our approaches in a framework called OPTIQ and demonstrate its efficacy via a set of benchmarks. Our results show 30% higher throughput in an experiment with data and patterns from a real application. For certain communication patterns, throughput can be up to several times higher than with the default MPI_Alltoallv used on the Blue Gene/Q supercomputer.
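
The idea of rebalancing load across physical links can be illustrated with a toy calculation. The sketch below is not OPTIQ's algorithm, which the abstract does not describe. The four-node topology, the message sizes, and the helper names `link_loads` and `split_evenly` are all assumptions chosen for illustration. It compares the most heavily loaded link when each message follows a single fixed path against the case where one message is split over two paths.

```python
from collections import defaultdict


def link_loads(flows):
    """Sum the data carried by each directed link.

    flows: list of (path, size) pairs; a path is a list of node names.
    """
    loads = defaultdict(float)
    for path, size in flows:
        # Each consecutive pair of nodes on the path is one link.
        for u, v in zip(path, path[1:]):
            loads[(u, v)] += size
    return dict(loads)


def split_evenly(paths, size):
    """Divide one message evenly over several candidate paths."""
    share = size / len(paths)
    return [(path, share) for path in paths]


# Hypothetical square of four nodes with links A-B, A-C, B-D, C-D.
# Two messages: A -> D (8 units) and A -> B (4 units).

# Single-path routing: both messages cross link (A, B).
single = link_loads([
    (["A", "B", "D"], 8),
    (["A", "B"], 4),
])

# Multipath routing: the A -> D message is split over both routes.
multi = link_loads(
    split_evenly([["A", "B", "D"], ["A", "C", "D"]], 8)
    + [(["A", "B"], 4)]
)

print("busiest link, single path:", max(single.values()))
print("busiest link, multipath:  ", max(multi.values()))
```

In this toy case the busiest link carries 12 units under single-path routing and 8 units once the larger message is split, which is the kind of bottleneck reduction that load rebalancing aims for.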