An Adaptive Coloring Scheme for Graphics Processing Unit Preconditioners

Day 1 Tue, March 28, 2023. Pub Date: 2023-03-21. DOI: 10.2118/212248-ms

Christopher Lemon, H. Cao, M. D. E. Szyndel, Eduard Khramchenkov



Abstract

A single modern graphics processing unit (GPU) typically has memory bandwidth equivalent to that of many central processing unit (CPU) nodes. This makes GPU hardware appealing for linear solvers, which tend to require high memory bandwidth and fast inter-core communication. Reservoir simulators are designed to handle a wide range of simulation models, and to obtain peak performance the linear solver must be well suited to the resulting linear systems. This requirement can lead to disappointing performance when shifting the linear solver from CPU to GPU. To fully utilize the capabilities of the latest GPU devices, we must transition from coarse-grained to fine-grained parallel preconditioners. A common approach to enabling such high levels of parallelism in the linear solver is to employ a multicolor reordering of the linear system. Depending on the specific properties of the simulation model, this reordering can significantly weaken the parallel preconditioner, resulting in much slower convergence. In some situations, the slow convergence can cause an order-of-magnitude increase in the linear iteration count and result in the GPU linear solver performing worse than the CPU version. In this paper, we analyze the performance impact of employing different coloring schemes for different simulation models, and we identify how the coloring can be automatically adapted to the properties of each simulation model. In this way, the performance improvements expected on the GPU can be realized for a wider range of simulations.
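To make the idea of a multicolor reordering concrete, the following is a minimal sketch and not the adaptive scheme of the paper, whose details the abstract does not give. It assumes a generic greedy coloring applied to the 5-point-stencil connectivity of a small structured grid. The function names (`greedy_coloring`, `grid_adjacency`, `multicolor_permutation`) and the grid size are illustrative assumptions. Unknowns that share a color have no direct coupling, so a preconditioner sweep can process each color block in parallel.

```python
from collections import defaultdict


def greedy_coloring(adjacency):
    """Give each node the smallest color not used by an already-colored neighbor."""
    colors = {}
    for node in sorted(adjacency):
        taken = {colors[nbr] for nbr in adjacency[node] if nbr in colors}
        color = 0
        while color in taken:
            color += 1
        colors[node] = color
    return colors


def grid_adjacency(nx, ny):
    """5-point-stencil connectivity of an nx-by-ny grid, cells numbered row by row.

    Hypothetical stand-in for the sparsity pattern of a simulation matrix.
    """
    adjacency = defaultdict(set)
    for j in range(ny):
        for i in range(nx):
            cell = j * nx + i
            adjacency[cell]  # touch the key so every cell appears in the map
            if i + 1 < nx:
                adjacency[cell].add(cell + 1)
                adjacency[cell + 1].add(cell)
            if j + 1 < ny:
                adjacency[cell].add(cell + nx)
                adjacency[cell + nx].add(cell)
    return dict(adjacency)


def multicolor_permutation(colors):
    """Order unknowns color by color; unknowns within one color are uncoupled."""
    return sorted(colors, key=lambda node: (colors[node], node))


adjacency = grid_adjacency(4, 3)
colors = greedy_coloring(adjacency)
perm = multicolor_permutation(colors)
num_colors = max(colors.values()) + 1
```

On this grid the greedy pass produces the familiar two-color (red-black) checkerboard. Using few colors maximizes the work available per parallel step, but, as the abstract notes, the reordering can weaken the preconditioner for some models, which is the trade-off an adaptive choice of coloring would aim to balance.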


