An Adaptive Coloring Scheme for Graphics Processing Unit Preconditioners

Christopher Lemon, H. Cao, M. D. E. Szyndel, Eduard Khramchenkov
DOI: 10.2118/212248-ms (https://doi.org/10.2118/212248-ms)
Published in: Day 1 Tue, March 28, 2023. Publication date: 2023-03-21.

Abstract

A single modern graphics processing unit (GPU) typically has memory bandwidth equivalent to that of many central processing unit (CPU) nodes. This makes GPU hardware appealing for linear solvers, which tend to require high memory bandwidth and fast inter-core communication. Reservoir simulators are designed to handle a wide range of simulation models, and to obtain peak performance the linear solver must be well suited to the resulting linear systems. This requirement can lead to disappointing performance when shifting the linear solver from CPU to GPU. To fully utilize the capabilities of the latest GPU devices, we must transition from coarse-grained to fine-grained parallel preconditioners. A common approach to enabling such high levels of parallelism in the linear solver is to employ a multicolor reordering of the linear system. Depending on the specific properties of the simulation model, this reordering can significantly weaken the parallel preconditioner, resulting in much slower convergence. In some situations, the slow convergence can cause an order-of-magnitude increase in the linear iteration count and result in the GPU linear solver performing worse than the CPU version. In this paper we analyze the performance impact of employing different coloring schemes for different simulation models, and we identify how the coloring can be automatically adapted to the properties of each simulation model. In this way, the performance improvements expected on the GPU can be realized for a wider range of simulations.
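
To make the idea of a multicolor reordering concrete, the following is a minimal Python sketch of plain greedy coloring applied to the coupling graph of a small structured grid. It is not the adaptive scheme of the paper, whose details the abstract does not give; the function names (`grid_adjacency`, `greedy_coloring`, `multicolor_permutation`) and the 5-point-stencil test grid are assumptions chosen purely for illustration.

```python
def grid_adjacency(nx, ny):
    """Couplings of a 5-point stencil on an nx-by-ny grid (hypothetical test problem)."""
    adjacency = {}
    for j in range(ny):
        for i in range(nx):
            node = j * nx + i
            neighbors = []
            if i > 0:
                neighbors.append(node - 1)   # west neighbor
            if i < nx - 1:
                neighbors.append(node + 1)   # east neighbor
            if j > 0:
                neighbors.append(node - nx)  # south neighbor
            if j < ny - 1:
                neighbors.append(node + nx)  # north neighbor
            adjacency[node] = neighbors
    return adjacency


def greedy_coloring(adjacency):
    """Give each node the smallest color not used by an already-colored neighbor."""
    colors = {}
    for node in sorted(adjacency):
        used = {colors[nb] for nb in adjacency[node] if nb in colors}
        color = 0
        while color in used:
            color += 1
        colors[node] = color
    return colors


def multicolor_permutation(colors):
    """Order unknowns color by color; unknowns of one color are mutually uncoupled."""
    return sorted(colors, key=lambda node: (colors[node], node))


adjacency = grid_adjacency(4, 3)
colors = greedy_coloring(adjacency)
permutation = multicolor_permutation(colors)
num_colors = max(colors.values()) + 1
print(num_colors, permutation)
```

On this grid the greedy pass reproduces the classic red-black (two-color) ordering. Unknowns that share a color have no direct coupling, so a preconditioner sweep can update all of them concurrently, which is the source of the fine-grained parallelism. The trade-off described in the abstract is that such a reordering can weaken the preconditioner and slow convergence for some models.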