Performance comparison of Eulerian kinetic Vlasov code between flat-MPI parallelism and hybrid parallelism on Fujitsu FX100 supercomputer
T. Umeda, K. Fukazawa
Proceedings of the 23rd European MPI Users' Group Meeting, 2016-09-25
DOI: 10.1145/2966884.2966891 (https://doi.org/10.1145/2966884.2966891)
Citation count: 0
Abstract
This study concerns a Vlasov simulation code that solves the Vlasov equation, the first-principles kinetic equation for space plasma. A five-dimensional Vlasov code with two spatial dimensions and three velocity dimensions is parallelized with two methods: flat-MPI parallelism and MPI-OpenMP hybrid parallelism. Both versions of the parallel Vlasov code are benchmarked on the massively parallel Fujitsu FX100 supercomputer, whose architecture is the second-generation successor to that of the K computer in Japan. In the performance comparison, we vary the number of threads per node from 1 (flat-MPI parallelism) to 32. The results show that MPI-OpenMP hybrid parallelism outperforms flat-MPI parallelism for any number of compute nodes. The optimum number of threads per node depends on the number of compute nodes and becomes larger as the number of compute nodes increases. This is because the communication time of an MPI collective communication subroutine, which is used for the convergence check of iterative methods, can be reduced by decreasing the total number of processes with MPI-OpenMP hybrid parallelism.
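The relationship between thread count and process count described in the abstract can be illustrated with a small calculation. The following is a minimal sketch, not the authors' code. It assumes 32 compute cores per node, and it reads the thread count as the number of OpenMP threads per MPI process (so 1 thread corresponds to flat-MPI and 32 threads to one process per node). It also assumes, purely for illustration, that a collective reduction proceeds in roughly ceil(log2(P)) communication stages for P processes; actual MPI implementations choose their own algorithms. The names `CORES_PER_NODE`, `mpi_processes`, and `reduction_stages` are hypothetical.

```python
# Assumption: 32 compute cores per node, all of them used.
CORES_PER_NODE = 32


def mpi_processes(nodes: int, threads_per_process: int) -> int:
    """Total number of MPI processes when every core runs exactly one thread."""
    if threads_per_process < 1 or CORES_PER_NODE % threads_per_process != 0:
        raise ValueError("threads_per_process must divide CORES_PER_NODE")
    return nodes * (CORES_PER_NODE // threads_per_process)


def reduction_stages(processes: int) -> int:
    """ceil(log2(P)): stage count of an idealized tree-style reduction.

    This is an illustrative cost model only, not a measured quantity.
    """
    if processes < 1:
        raise ValueError("processes must be positive")
    return (processes - 1).bit_length()


if __name__ == "__main__":
    nodes = 64  # hypothetical node count, chosen only for the example
    for threads in (1, 2, 4, 8, 16, 32):
        p = mpi_processes(nodes, threads)
        print(f"threads={threads:2d}  processes={p:5d}  stages={reduction_stages(p)}")
```

Under these assumptions, moving from flat-MPI to 32 threads per process on 64 nodes reduces the process count from 2048 to 64, which shortens the idealized reduction from 11 stages to 6. This is consistent with the abstract's explanation that fewer processes reduce the cost of the collective used for the convergence check, although the model says nothing about the computation time inside each process.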