An efficient GPU-accelerated adaptive mesh refinement framework for high-fidelity compressible reactive flows modeling

IF 3.4 | Zone 2, Physics and Astrophysics | Q1, Computer Science, Interdisciplinary Applications

Computer Physics Communications | Pub Date: 2025-09-22 | DOI: 10.1016/j.cpc.2025.109870

Yuqi Wang, Yadong Zeng, Ralf Deiterding, Jianhan Liang

Computer Physics Communications, Volume 318, Article 109870. https://www.sciencedirect.com/science/article/pii/S0010465525003728

Citations: 0

Abstract


An efficient GPU-accelerated adaptive mesh refinement framework for high-fidelity compressible reactive flows modeling

This paper presents a heterogeneous adaptive mesh refinement (AMR) framework for exascale simulations of non-stiff/moderately stiff chemical kinetics. The framework features an efficient time-subcycling stepping algorithm along with a specialized refluxing method, all unified in a highly parallel, scalable codebase. In addition, we develop a GPU-optimized low-storage explicit Runge–Kutta chemical integrator designed to minimize register usage, achieving higher efficiency than its implicit counterparts for detailed chemical kinetics with small mechanism size in high-speed combustion problems. A suite of benchmarks demonstrates the framework's high fidelity for both non-reactive and reactive simulations on both uniform and adaptively refined grids. By leveraging our parallelization strategy developed on top of AMReX, we demonstrate significant speedups on various problems using an NVIDIA V100 GPU compared to an Intel i9 CPU within the same codebase. In particular, for problems with complex physics and spatiotemporally distributed stiffness, such as hydrogen detonation propagation, we achieve an overall speedup of 6.49× with substantial computational throughput. Finally, this AMR framework is applied to a large-scale three-dimensional direct numerical simulation. Compared to prior CPU computations on a uniform grid, it yields a substantial reduction in total degrees of freedom involved in the calculation without compromising accuracy.
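To illustrate what a low-storage explicit Runge–Kutta integrator looks like, the sketch below uses the classic Williamson three-stage, third-order 2N-storage scheme. This is not the paper's integrator: the abstract does not give its coefficients, so a well-known scheme is swapped in here. The one-step decay "mechanism" (`decay_rhs`), the function names, and the step count are hypothetical choices for the example.

```python
import math

# Williamson (1980) 2N-storage RK3 coefficients (assumed here for
# illustration; the paper's own scheme is not given in the abstract).
A = (0.0, -5.0 / 9.0, -153.0 / 128.0)
B = (1.0 / 3.0, 15.0 / 16.0, 8.0 / 15.0)


def lsrk3_step(rhs, u, dt):
    """Advance the state u in place by one step of size dt.

    Only two persistent registers are kept: the state u and the
    accumulator du. The list f is a temporary in this sketch; a GPU
    kernel would typically fuse the right-hand-side evaluation into
    the update to avoid holding it.
    """
    du = [0.0] * len(u)
    for a, b in zip(A, B):
        f = rhs(u)
        for i in range(len(u)):
            du[i] = a * du[i] + dt * f[i]  # overwrite the accumulator
            u[i] += b * du[i]              # overwrite the state
    return u


def decay_rhs(y, k=1.0):
    # Hypothetical single-reaction mechanism: species 0 -> species 1 at rate k.
    return [-k * y[0], k * y[0]]


def integrate(t_end=1.0, n_steps=100):
    y = [1.0, 0.0]
    dt = t_end / n_steps
    for _ in range(n_steps):
        lsrk3_step(decay_rhs, y, dt)
    return y


if __name__ == "__main__":
    y = integrate()
    print(y[0], math.exp(-1.0))
```

Each stage overwrites `du` and `u` rather than storing separate stage vectors, which is the property that keeps per-cell memory (and, on a GPU, register pressure) low. Being explicit, a scheme like this is limited by stability to small steps on stiff problems, consistent with the abstract's restriction to non-stiff and moderately stiff kinetics.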


Source journal

Computer Physics Communications (Physics / Computer Science: Interdisciplinary Applications)

CiteScore: 12.10
Self-citation rate: 3.20%
Articles published: 287
Review time: 5.3 months

Journal description: The focus of CPC is on contemporary computational methods and techniques and their implementation, the effectiveness of which will normally be evidenced by the author(s) within the context of a substantive problem in physics. Within this setting CPC publishes two types of paper.

Computer Programs in Physics (CPiP): These papers describe significant computer programs to be archived in the CPC Program Library, which is held in the Mendeley Data repository. The submitted software must be covered by an approved open source licence. Papers and associated computer programs that address a problem of contemporary interest in physics that cannot be solved by current software are particularly encouraged.

Computational Physics Papers (CP): These are research papers in, but not limited to, the following themes across computational physics and related disciplines: mathematical and numerical methods and algorithms; computational models, including those associated with the design, control and analysis of experiments; and algebraic computation. Each will normally include software implementation and performance details. The software implementation should, ideally, be available via GitHub, Zenodo or an institutional repository.

In addition, research papers on the impact of advanced computer architecture and special purpose computers on computing in the physical sciences, and on software topics related to, and of importance in, the physical sciences, may be considered.