LogGPH: A Parallel Computational Model with Hierarchical Communication Awareness
Liang Yuan, Yunquan Zhang, Yuxin Tang, L. Rao, Xiangzheng Sun
2010 13th IEEE International Conference on Computational Science and Engineering, 2010-12-11
DOI: 10.1109/CSE.2010.40 (https://doi.org/10.1109/CSE.2010.40)
Citations: 17
Abstract
In large-scale cluster systems, interconnecting thousands of computing nodes increases the complexity of the network topology. Nevertheless, few existing computational models consider the hierarchical communication latencies and bandwidths that this complexity causes. In this paper we propose a new parallel computational model, LogGPH, which adds a parameter H to the LogGP model to describe the communication hierarchy. We use the new model to predict and analyze point-to-point communication and the collective MPI_Allgather on two 100-Terascale supercomputers, the Dawning 5000A and the Deep Comp 7000. The results show that LogGPH is more accurate than LogGP: its mean absolute error on point-to-point communication is 13%, compared with 30% when the communication hierarchy is not considered.
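To make the idea concrete, here is a minimal Python sketch of a hierarchy-aware point-to-point estimate. It uses the standard LogGP cost of a k-byte message, o + (k - 1)G + L + o, and assumes that the hierarchy can be represented by giving each communication level its own parameter set. The abstract does not define how H enters the model, so the per-level table, the names `LevelParams`, `loggp_time`, `hierarchical_time`, and `mean_abs_relative_error`, and all numeric values are illustrative assumptions, not the authors' formulation or measurements. The error metric is likewise assumed to be a mean of relative absolute errors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LevelParams:
    """LogGP parameters for one communication level (hypothetical values)."""
    L: float  # network latency, microseconds
    o: float  # per-message send or receive overhead, microseconds
    G: float  # gap per byte for long messages, microseconds per byte


def loggp_time(k_bytes: int, p: LevelParams) -> float:
    """Standard LogGP time for one k-byte point-to-point message."""
    return p.o + (k_bytes - 1) * p.G + p.L + p.o


def hierarchical_time(k_bytes: int, level: int, levels: dict) -> float:
    """Assumed hierarchy handling: pick the parameter set of the level
    that the two communicating processes share, then apply LogGP."""
    return loggp_time(k_bytes, levels[level])


def mean_abs_relative_error(predicted, measured) -> float:
    """Mean of |predicted - measured| / measured over all samples."""
    pairs = list(zip(predicted, measured))
    return sum(abs(p - m) / m for p, m in pairs) / len(pairs)


# Illustrative two-level hierarchy: 0 = same node, 1 = across the network.
levels = {
    0: LevelParams(L=1.0, o=0.5, G=0.001),
    1: LevelParams(L=5.0, o=1.0, G=0.01),
}

intra_node = hierarchical_time(1001, 0, levels)
inter_node = hierarchical_time(1001, 1, levels)
```

With a single parameter set, as in plain LogGP, both calls would return the same time. Separating the levels is what lets the prediction differ between nearby and distant process pairs, which is the effect the abstract attributes to the added parameter.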