GPU-Enabled Scalable Multiscale Solver for Reservoir Simulation
A. Manea
DOI: 10.2523/iptc-22024-ms (https://doi.org/10.2523/iptc-22024-ms)
Published in: Day 3 Wed, February 23, 2022
Publication date: 2022-02-21
Abstract
As reservoir simulation models grow in size and complexity, the computational cost of reservoir simulation keeps increasing. Since most of the simulation time is typically spent in the linear solver (the innermost part of the simulator and the most challenging to parallelize and scale), efficient linear solvers are of utmost importance for reducing simulation turnaround times. In this work, we study the scalability of a versatile multiscale linear solver, the restriction-smoothed basis multiscale method (MsRSB) (Møyner and Lie (2016)), on the emerging massively parallel GPU architecture and compare it with its performance on the multi-core CPU architecture.
Unlike traditional multiscale approaches, MsRSB uses iterative smoothing to adaptively compute multiscale basis functions, allowing it to handle a wide range of difficult grid orientations seen in real-world industrial applications. While MsRSB can be parallelized directly, its reliance on a smoother to determine the basis functions results in unusual control-flow and data-flow patterns. To achieve effective scalability, these patterns must be carefully designed and implemented on massively parallel systems. We extend the work of Manea et al. (2016) and Manea and Almani (2019) on parallel multiscale methods to move the MsRSB special kernels to shared-memory parallel multi-core and GPU architectures.
Highly heterogeneous multimillion-cell 3D problems, adopted from the SPE10 benchmark (Christie and Blunt (2001)), are used to illustrate the scalability of our parallel MsRSB development. The GPU implementation is benchmarked on a massively parallel architecture consisting of Nvidia Volta V100 GPUs, while the multi-core implementation is benchmarked on a shared-memory multi-core architecture consisting of two packages of Intel's Haswell-EP Xeon(R) CPU E5-2667. We compare the multi-core and GPU implementations for both the setup and solution stages. The GPU-based MsRSB implementation scales well, with over a 4-fold reduction in runtime compared to the optimized multi-core implementation.
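To make the idea of computing basis functions by iterative smoothing more concrete, the following is a minimal serial sketch on a 1D toy problem. It is not the implementation studied in this work. The function names (`build_system`, `smoothed_basis`), the damping factor `omega`, the sweep count, the choice of support regions (cells strictly between the centers of the neighboring coarse blocks), and the row rescaling used to keep the basis functions summing to one are all assumptions made for illustration.

```python
def build_system(trans):
    """Tridiagonal 1D flow matrix with zero row sums.

    trans[i] is the (hypothetical) transmissibility linking cells i and i + 1.
    """
    n = len(trans) + 1
    A = [[0.0] * n for _ in range(n)]
    for i, t in enumerate(trans):
        A[i][i] += t
        A[i + 1][i + 1] += t
        A[i][i + 1] -= t
        A[i + 1][i] -= t
    return A


def smoothed_basis(A, block_size, omega=2.0 / 3.0, sweeps=50):
    """Smooth block indicator functions into basis functions.

    Assumes len(A) is a multiple of block_size. Returns P as a list of rows,
    where P[i][j] is the value of basis function j in fine cell i.
    """
    n = len(A)
    n_blocks = n // block_size
    centers = [j * block_size + block_size // 2 for j in range(n_blocks)]

    # Assumed support of basis j: cells strictly between neighboring centers.
    support = []
    for j in range(n_blocks):
        lo = centers[j - 1] + 1 if j > 0 else 0
        hi = centers[j + 1] - 1 if j < n_blocks - 1 else n - 1
        support.append(range(lo, hi + 1))

    # Initial guess: indicator function of each coarse block.
    P = [[1.0 if i // block_size == j else 0.0 for j in range(n_blocks)]
         for i in range(n)]

    for _ in range(sweeps):
        new_P = [row[:] for row in P]
        for j in range(n_blocks):
            # Damped Jacobi update, applied only inside the support region.
            for i in support[j]:
                residual = sum(A[i][k] * P[k][j]
                               for k in range(n) if A[i][k] != 0.0)
                new_P[i][j] = P[i][j] - omega * residual / A[i][i]
        # Rescale each row so the basis functions still sum to one per cell.
        for row in new_P:
            total = sum(row)
            for j in range(n_blocks):
                row[j] /= total
        P = new_P
    return P


# Usage: 12 fine cells with strongly varying coefficients, 3 coarse blocks.
trans = [1.0, 100.0, 0.01, 1.0, 10.0, 0.1, 1.0, 5.0, 0.5, 2.0, 1.0]
P = smoothed_basis(build_system(trans), block_size=4)
```

In this sketch, every (cell, basis function) pair within a sweep depends only on values from the previous sweep, which is what makes the smoothing step a natural candidate for fine-grained parallelism. The restriction to support regions and the per-row rescaling are the parts that introduce irregular control and data flow.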