Acceleration of a 3D Immersed Boundary Solver Using OpenACC
Apurva Raj, Somnath C. Roy, N. Vydyanathan, Bharatkumar Sharma
2018 IEEE 25th International Conference on High Performance Computing Workshops (HiPCW), published 2018-12-01
DOI: 10.1109/HIPCW.2018.8634138 (https://doi.org/10.1109/HIPCW.2018.8634138)
Citations: 5
Abstract
Immersed-boundary methods (IBM) have steadily gained popularity over the last three decades and are expanding into new application areas in computational mechanics, owing to their potential for modeling complex multiphysics phenomena that involve flow over complex and moving boundaries. The specific advantages of an immersed boundary method are its accuracy and simplicity. Because the method uses a fixed, structured Cartesian mesh, complex grid generation can be avoided entirely, while the complex or moving boundary is described by a separate surface mesh. The computational overhead of an immersed boundary implementation can be very high because of the expensive search and interpolation steps through which the effects of the boundary conditions on the surface mesh are transferred to the fixed Cartesian volume mesh. A computationally efficient numerical implementation of an IBM solver is therefore of great importance to researchers. This paper presents an accelerated, discrete finite-difference-based immersed boundary (IB) solver used to study external flow around complex geometries. The flow is assumed to be incompressible. The solver is parallelized with OpenACC to obtain quick acceleration with minimal code changes and to ensure performance portability across both GPUs and multicore CPUs. Our experimental results indicate that the OpenACC-based IB solver running on an NVIDIA Tesla P100 GPU is 21x faster than the sequential legacy solver and 3.3x faster than the same OpenACC-based solver running on a dual-socket Intel Xeon Gold 6148, 20-core CPU. The recirculation lengths obtained at Reynolds numbers 20 and 40 and the Strouhal number at Reynolds number 100, for a standard flow visualization problem over a fixed cylinder, agree with data reported in the literature, thereby validating the accuracy of the parallel solver. We also analyze the performance of the accelerated solver on different GPU architectures: Kepler, Pascal and Volta.
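As a rough illustration of what "minimal code changes" can look like with OpenACC, the sketch below annotates a generic finite-difference stencil loop on a structured Cartesian grid with a single directive. The function name `jacobi_sweep`, the row-major array layout and the Laplace-type update are assumptions chosen for illustration; this is not the authors' solver, and it is shown in 2D for brevity.

```c
#include <assert.h>

/* Hypothetical example: one Jacobi sweep of a 2D Laplace-type stencil on an
 * nx-by-ny Cartesian grid stored row-major. Only interior points are updated;
 * boundary entries of u_new are left as the caller supplied them. */
void jacobi_sweep(int nx, int ny,
                  const double *restrict u, double *restrict u_new)
{
    /* Jacobi iteration needs distinct input and output arrays. */
    assert(u != u_new);

    /* The directive is the only change relative to the sequential loop.
     * Compilers without OpenACC support ignore it and run the loop serially.
     * copyin: u is read-only on the device.
     * copy:   u_new is copied in and out so untouched boundary values survive. */
    #pragma acc parallel loop collapse(2) copyin(u[0:nx*ny]) copy(u_new[0:nx*ny])
    for (int j = 1; j < ny - 1; ++j) {
        for (int i = 1; i < nx - 1; ++i) {
            u_new[j * nx + i] = 0.25 * (u[j * nx + i - 1] + u[j * nx + i + 1]
                                      + u[(j - 1) * nx + i] + u[(j + 1) * nx + i]);
        }
    }
}
```

Because the loop body is unchanged, the same source can in principle be built for a GPU or a multicore CPU by switching the compiler's OpenACC target, which is the kind of performance portability the abstract refers to. A real solver would typically keep arrays resident on the device across time steps with an enclosing data region rather than copying on every sweep.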