Usuf Middya, A. Manea, Alhubail Maitham Makki, Todd R. Ferguson, T. Byer, A. Dogru
{"title":"基于GPU架构的大规模并行水库模拟器","authors":"Usuf Middya, A. Manea, Alhubail Maitham Makki, Todd R. Ferguson, T. Byer, A. Dogru","doi":"10.2118/203918-ms","DOIUrl":null,"url":null,"abstract":"\n Reservoir simulation computational costs have been continuously growing due to high-resolution reservoir characterization, increasing model complexity, and uncertainty analysis workflows. Reducing simulation costs by upscaling is often necessary for operational requirements. Fast evolving HPC technologies offer opportunities to reduce cost without compromising fidelity.\n This work presents a novel in-house massively parallel full-physics reservoir simulator running on the emerging GPU architecture. Almost all the simulation kernels have been designed and implemented to honor the GPU SIMD programming paradigm. These kernels include physical property calculations, phase equilibrium computations, Jacobian construction, linear and nonlinear solvers, and wells. Novel techniques are devised in various kernels to expose enough parallelism to ensure that the control and data-flow patterns are well suited for the GPU environment. Mixed-precision computation is also employed when appropriate (e.g., in derivative calculation) to reduce computational costs without compromising the solution accuracy.\n The GPU implementation of the simulator is tested and benchmarked using various reservoir models, ranging from the synthetic SPE10 Benchmark (Christie & Blunt, 2001) to several industrial-scale models. These real field models range in size from tens of millions of cells to more than billion cells with black-oil and multicomponent compositional fluid. The GPU simulator is benchmarked on the IBM AC922 massively parallel architecture having tens of NVidia Volta V100 GPUs. To compare performance with CPU architectures, an optimized CPU implementation of the simulator is benchmarked on the IBM AC922 CPUs and on a cluster consisting of thousands of Intel's Haswell-EP Xeon® CPU E5-2680 v3. Detailed analysis of several numerical experiments comparing the simulator performance on the GPU and the CPU architectures is presented. In almost all of the cases, the analysis shows that the use of hardware acceleration offers substantial benefits in terms of wall time and power consumption.\n This novel in-house full-physics, black-oil and compositional reservoir simulator employs several novel techniques in various simulation kernels to ensure full utilization of the GPU resources. Detailed analysis is presented to highlight the simulator performance in terms of runtime reduction, parallel scalability and power savings.","PeriodicalId":11146,"journal":{"name":"Day 1 Tue, October 26, 2021","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2021-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"A Massively Parallel Reservoir Simulator on the GPU Architecture\",\"authors\":\"Usuf Middya, A. Manea, Alhubail Maitham Makki, Todd R. Ferguson, T. Byer, A. Dogru\",\"doi\":\"10.2118/203918-ms\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"\\n Reservoir simulation computational costs have been continuously growing due to high-resolution reservoir characterization, increasing model complexity, and uncertainty analysis workflows. Reducing simulation costs by upscaling is often necessary for operational requirements. 
Fast evolving HPC technologies offer opportunities to reduce cost without compromising fidelity.\\n This work presents a novel in-house massively parallel full-physics reservoir simulator running on the emerging GPU architecture. Almost all the simulation kernels have been designed and implemented to honor the GPU SIMD programming paradigm. These kernels include physical property calculations, phase equilibrium computations, Jacobian construction, linear and nonlinear solvers, and wells. Novel techniques are devised in various kernels to expose enough parallelism to ensure that the control and data-flow patterns are well suited for the GPU environment. Mixed-precision computation is also employed when appropriate (e.g., in derivative calculation) to reduce computational costs without compromising the solution accuracy.\\n The GPU implementation of the simulator is tested and benchmarked using various reservoir models, ranging from the synthetic SPE10 Benchmark (Christie & Blunt, 2001) to several industrial-scale models. These real field models range in size from tens of millions of cells to more than billion cells with black-oil and multicomponent compositional fluid. The GPU simulator is benchmarked on the IBM AC922 massively parallel architecture having tens of NVidia Volta V100 GPUs. To compare performance with CPU architectures, an optimized CPU implementation of the simulator is benchmarked on the IBM AC922 CPUs and on a cluster consisting of thousands of Intel's Haswell-EP Xeon® CPU E5-2680 v3. Detailed analysis of several numerical experiments comparing the simulator performance on the GPU and the CPU architectures is presented. In almost all of the cases, the analysis shows that the use of hardware acceleration offers substantial benefits in terms of wall time and power consumption.\\n This novel in-house full-physics, black-oil and compositional reservoir simulator employs several novel techniques in various simulation kernels to ensure full utilization of the GPU resources. Detailed analysis is presented to highlight the simulator performance in terms of runtime reduction, parallel scalability and power savings.\",\"PeriodicalId\":11146,\"journal\":{\"name\":\"Day 1 Tue, October 26, 2021\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-10-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Day 1 Tue, October 26, 2021\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.2118/203918-ms\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Day 1 Tue, October 26, 2021","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2118/203918-ms","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A Massively Parallel Reservoir Simulator on the GPU Architecture
Reservoir simulation computational costs have been growing continuously due to high-resolution reservoir characterization, increasing model complexity, and uncertainty analysis workflows. Reducing simulation costs by upscaling is often necessary to meet operational requirements. Fast-evolving HPC technologies offer opportunities to reduce cost without compromising fidelity.
This work presents a novel in-house massively parallel full-physics reservoir simulator running on the emerging GPU architecture. Almost all of the simulation kernels have been designed and implemented to honor the GPU SIMD programming paradigm. These kernels include physical property calculations, phase equilibrium computations, Jacobian construction, linear and nonlinear solvers, and well calculations. Novel techniques are devised in various kernels to expose enough parallelism and to ensure that the control- and data-flow patterns are well suited to the GPU environment. Mixed-precision computation is also employed where appropriate (e.g., in derivative calculations) to reduce computational cost without compromising solution accuracy.
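To illustrate the cell-wise, SIMD-friendly kernel design and the mixed-precision strategy described above, the sketch below shows a minimal CUDA property kernel. The kernel name, data layout, and the simplified slightly-compressible density model are assumptions made for illustration only; the paper does not publish its kernel source.

```cuda
// Minimal sketch of a cell-wise, mixed-precision GPU property kernel.
// Property values are kept in double precision, while the derivative
// (used only for Jacobian construction) is stored in single precision,
// mirroring the mixed-precision idea described in the text.
// All names and the simplified density model are illustrative assumptions.
#include <cuda_runtime.h>

__global__ void oil_density_kernel(const double* __restrict__ pressure,  // cell pressures [Pa]
                                   double*       __restrict__ rho,       // density values [kg/m3]
                                   float*        __restrict__ drho_dp,   // derivatives, single precision
                                   double rho_ref, double p_ref, double compressibility,
                                   int n_cells)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n_cells) return;

    // Slightly-compressible model: rho = rho_ref * exp(c * (p - p_ref)).
    // One thread per cell with uniform control flow keeps the kernel SIMD-friendly.
    double val = rho_ref * exp(compressibility * (pressure[i] - p_ref));

    rho[i]     = val;                                        // value in double precision
    drho_dp[i] = static_cast<float>(compressibility * val);  // derivative in single precision
}

int main()
{
    const int n = 1 << 20;  // one million cells (illustrative size)
    double *p_d, *rho_d;
    float  *drho_d;
    cudaMalloc(&p_d,    n * sizeof(double));
    cudaMalloc(&rho_d,  n * sizeof(double));
    cudaMalloc(&drho_d, n * sizeof(float));

    // ... fill p_d with cell pressures (omitted) ...

    int block = 256;
    int grid  = (n + block - 1) / block;
    oil_density_kernel<<<grid, block>>>(p_d, rho_d, drho_d,
                                        800.0, 1.0e5, 1.0e-9, n);
    cudaDeviceSynchronize();

    cudaFree(p_d); cudaFree(rho_d); cudaFree(drho_d);
    return 0;
}
```

In a full simulator, kernels of this form would be launched for every property and phase over all cells, with the single-precision derivative arrays feeding directly into Jacobian assembly.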
The GPU implementation of the simulator is tested and benchmarked using various reservoir models, ranging from the synthetic SPE10 benchmark (Christie & Blunt, 2001) to several industrial-scale models. These real field models range in size from tens of millions of cells to more than a billion cells, with black-oil and multicomponent compositional fluids. The GPU simulator is benchmarked on the IBM AC922 massively parallel architecture equipped with tens of NVIDIA Volta V100 GPUs. To compare performance with CPU architectures, an optimized CPU implementation of the simulator is benchmarked on the IBM AC922 CPUs and on a cluster consisting of thousands of Intel Haswell-EP Xeon® E5-2680 v3 CPUs. Detailed analysis of several numerical experiments comparing the simulator performance on the GPU and CPU architectures is presented. In almost all cases, the analysis shows that the use of hardware acceleration offers substantial benefits in terms of wall time and power consumption.
This novel in-house full-physics, black-oil and compositional reservoir simulator employs several novel techniques in its various simulation kernels to ensure full utilization of GPU resources. Detailed analysis is presented to highlight the simulator's performance in terms of runtime reduction, parallel scalability, and power savings.