MPI collective communication operations on large shared memory systems
M. Bernaschi, G. Richelli
Proceedings Ninth Euromicro Workshop on Parallel and Distributed Processing, 2001-02-07
DOI: 10.1109/EMPDP.2001.905038 (https://doi.org/10.1109/EMPDP.2001.905038)
Citations: 12
Abstract
Collective communication performance is critical in a number of MPI applications, yet relatively few results are available to assess the performance of MPI implementations, especially for shared memory multiprocessors. In this paper we focus on the most widely used primitive, broadcast, and present experimental results for the Sun Enterprise 10000. We compare the performance of the Sun MPI primitives with our implementation based on a quasi-optimal algorithm. Our tests highlight advantages and drawbacks of vendors' implementations of collective communication primitives and suggest that the choice of the best algorithm may depend on exogenous factors such as load balancing among tasks.
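
For readers unfamiliar with how a broadcast can be organized, the following is a minimal sketch of a binomial-tree broadcast schedule, a common textbook scheme. It is not the quasi-optimal algorithm of the paper, which the abstract does not describe, and it does not call any MPI library: the function name `binomial_broadcast_schedule` and its parameters are hypothetical, chosen only for illustration. The sketch computes which task would send to which in each round, assuming every task holding the data can send one message per round.

```python
def binomial_broadcast_schedule(num_tasks, root=0):
    """Return the rounds of a binomial-tree broadcast.

    Each round is a list of (sender, receiver) rank pairs.
    Illustrative only: this simulates the communication pattern
    and performs no actual message passing.
    """
    if num_tasks < 1:
        raise ValueError("num_tasks must be >= 1")
    if not 0 <= root < num_tasks:
        raise ValueError("root must be a valid rank")

    rounds = []
    have = 1  # relative ranks 0..have-1 already hold the data
    while have < num_tasks:
        pairs = []
        for rel in range(have):
            dst = rel + have  # each holder sends to a rank 'have' away
            if dst < num_tasks:
                # Shift relative ranks so the tree is rooted at 'root'.
                sender = (rel + root) % num_tasks
                receiver = (dst + root) % num_tasks
                pairs.append((sender, receiver))
        rounds.append(pairs)
        have *= 2  # the number of holders doubles every round
    return rounds


if __name__ == "__main__":
    for i, pairs in enumerate(binomial_broadcast_schedule(8)):
        print(f"round {i}: {pairs}")
```

Because the number of tasks holding the data doubles in each round, the schedule finishes in a logarithmic number of rounds when all tasks are equally ready. If some tasks arrive late, for example under load imbalance, a sender deep in the tree can stall its whole subtree, which is one way the best choice of algorithm can depend on factors outside the algorithm itself.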