Avoiding Synchronization to Accelerate a CFD Solver in GPU

2019 31st International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD). Pub Date: 2019-10-01. DOI: 10.1109/SBAC-PAD.2019.00041

Ernesto Dufrechu, P. Ezzatti, G. Usera


Abstract

caffa3d.MBRi is an open-source, GPU-aware, general-purpose incompressible flow solver. It aims to provide a useful tool for the numerical simulation of real-world fluid flow problems that require both geometrical flexibility and parallel computation capabilities, so that simulations with tens to hundreds of millions of cells become affordable. At the core of this tool are a number of linear solvers that can be selected according to the characteristics of the problem to be solved. For band matrices, the most efficient linear solver included in caffa3d.MBRi is the Strongly Implicit Procedure (SIP) solver. The parallelization of this solver follows the hyper-planes strategy, in which the computations within one hyper-plane bear no dependencies on each other and can be executed in parallel, while the hyper-planes themselves have to be processed sequentially. In this work, we analyze this strategy to reach an efficient GPU implementation of the SIP solver for caffa3d.MBRi. In particular, we design and implement a self-scheduling procedure to avoid the overhead of CPU-GPU synchronization implied by the hyper-planes strategy, outperforming the standard GPU implementation of the SIP solver by approximately 2×.
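The hyper-planes ordering mentioned in the abstract can be illustrated with a small CPU sketch. This is not the authors' code: it assumes a structured 3D grid in which each cell (i, j, k) of the forward sweep depends only on its lower neighbours (i-1, j, k), (i, j-1, k) and (i, j, k-1), so that all cells sharing the same index sum i + j + k are mutually independent. The function names `hyperplanes` and `forward_sweep`, and the coupling coefficient 0.5, are hypothetical choices made for the illustration.

```python
from collections import defaultdict


def hyperplanes(nx, ny, nz):
    """Group the cells of an nx x ny x nz grid by hyper-plane index i + j + k."""
    planes = defaultdict(list)
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                planes[i + j + k].append((i, j, k))
    # Index sums run from 0 to nx + ny + nz - 3, i.e. nx + ny + nz - 2 planes.
    return [planes[h] for h in range(nx + ny + nz - 2)]


def forward_sweep(nx, ny, nz, rhs, coupling=0.5):
    """Toy forward substitution (assumed stencil, not the SIP coefficients).

    Hyper-planes are visited one after another; the cells inside a plane
    only read values from the previous plane, so the inner loop could run
    in parallel.
    """
    x = {}
    for plane in hyperplanes(nx, ny, nz):      # sequential part
        for (i, j, k) in plane:                # independent part
            acc = rhs[(i, j, k)]
            for n in ((i - 1, j, k), (i, j - 1, k), (i, j, k - 1)):
                if min(n) >= 0:                # skip neighbours outside the grid
                    acc += coupling * x[n]
            x[(i, j, k)] = acc
    return x
```

In a straightforward GPU mapping of this pattern, each hyper-plane would correspond to one parallel step, with a synchronization point between consecutive planes; that per-plane synchronization is the overhead the abstract says the self-scheduling procedure is designed to avoid.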
