Implementation and performance analysis of hybrid MPI+OpenMP programming for parallel MLFMA on SMP cluster
Authors: Huailiang Xuan, W. Tong, Zhi-xun Gong, Youwen Lan
Published in: 2012 Third International Conference on Intelligent Control and Information Processing
Publication date: 2012-07-15
DOI: https://doi.org/10.1109/ICICIP.2012.6391557
Citation count: 2
Abstract
As multi-core CPUs are widely used in SMP clusters, parallel programming should pay more attention to shared-memory parallelization within a single node. Hybrid MPI+OpenMP programming is a natural fit: it combines distributed-memory parallelization between the nodes of a cluster with shared-memory parallelization on each node. In this paper, we propose a parallel MLFMA (multilevel fast multipole algorithm) approach based on the hybrid MPI+OpenMP model. The performance of the hybrid implementation is compared with that of our previous pure MPI version. The time cost of computation and communication, as well as memory consumption, is analyzed in detail. Because most modern HPC systems are clusters of SMP nodes, the implementation is relevant to them.
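The two-level structure described in the abstract can be illustrated with a small sketch. It is not the authors' MLFMA code and uses neither MPI nor OpenMP: it only mimics the decomposition in plain Python, with an outer split across simulated nodes (standing in for MPI ranks) and an inner split across threads that share a node's slice (standing in for OpenMP threads). The names `split_range`, `node_partial_sum`, and `hybrid_sum` are hypothetical, and a simple sum stands in for the real per-node computation.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(n_items, n_parts, part):
    """Return the half-open [start, stop) slice owned by `part` out of `n_parts`."""
    base, extra = divmod(n_items, n_parts)
    # The first `extra` parts each take one additional item.
    start = part * base + min(part, extra)
    stop = start + base + (1 if part < extra else 0)
    return start, stop


def node_partial_sum(data, n_nodes, node, n_threads):
    """Work of one simulated node: its threads share the node's slice of the data."""
    lo, hi = split_range(len(data), n_nodes, node)
    local = data[lo:hi]  # assumption: in a distributed code only this slice lives on the node

    def thread_sum(t):
        # Inner level: each thread works on a sub-slice of the shared local data.
        a, b = split_range(len(local), n_threads, t)
        return sum(local[a:b])

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return sum(pool.map(thread_sum, range(n_threads)))


def hybrid_sum(data, n_nodes, n_threads):
    """Outer level: combine per-node partial results (stands in for a reduction across nodes)."""
    return sum(node_partial_sum(data, n_nodes, r, n_threads) for r in range(n_nodes))
```

For example, `hybrid_sum(list(range(100)), n_nodes=4, n_threads=3)` returns 4950, the same as a serial sum. The sketch shows only how work is partitioned at the two levels; Python threads here illustrate structure, not speedup, and the communication and memory behavior analyzed in the paper is not modeled.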