Implementation and performance analysis of hybrid MPI+OpenMP programming for parallel MLFMA on SMP cluster
Authors: Huailiang Xuan, W. Tong, Zhi-xun Gong, Youwen Lan
Published in: 2012 Third International Conference on Intelligent Control and Information Processing
Publication date: 2012-07-15
DOI: https://doi.org/10.1109/ICICIP.2012.6391557
Citations: 2
Abstract
As multi-core CPUs are widely used in SMP clusters, parallel programming should pay more attention to shared-memory parallelization within a single node. Hybrid MPI+OpenMP programming is a natural fit: it combines distributed-memory parallelization between the nodes of a cluster with shared-memory parallelization on each node. In this paper, we propose a parallel MLFMA (multilevel fast multipole algorithm) approach based on the hybrid MPI+OpenMP model. The performance of the hybrid implementation is compared with that of our previous pure MPI version, and the time spent on computation and communication, as well as memory consumption, is analyzed in detail. Because most modern HPC systems are clusters of SMP nodes, the implementation is relevant to them.