An efficient GPU-accelerated adaptive mesh refinement framework for high-fidelity compressible reactive flows modeling

Impact factor 3.4 · CAS Zone 2 (Physics and Astrophysics) · JCR Q1 (Computer Science, Interdisciplinary Applications)
Yuqi Wang , Yadong Zeng , Ralf Deiterding , Jianhan Liang
DOI: 10.1016/j.cpc.2025.109870
Journal: Computer Physics Communications, Volume 318, Article 109870
Publication date: 2025-09-22
Publication type: Journal Article
Open access: No
Source: https://www.sciencedirect.com/science/article/pii/S0010465525003728
Citations: 0

Abstract

An efficient GPU-accelerated adaptive mesh refinement framework for high-fidelity compressible reactive flows modeling
This paper presents a heterogeneous adaptive mesh refinement (AMR) framework for exascale simulations of non-stiff/moderately stiff chemical kinetics. The framework features an efficient time-subcycling stepping algorithm along with a specialized refluxing method, all unified in a highly parallel, scalable codebase. In addition, we develop a GPU-optimized low-storage explicit Runge–Kutta chemical integrator designed to minimize register usage, achieving higher efficiency than its implicit counterparts for detailed chemical kinetics with small mechanism size in high-speed combustion problems. A suite of benchmarks demonstrates the framework's high fidelity for both non-reactive and reactive simulations on both uniform and adaptively refined grids. By leveraging our parallelization strategy developed on top of AMReX, we demonstrate significant speedups on various problems using an NVIDIA V100 GPU compared to an Intel i9 CPU within the same codebase. In particular, for problems with complex physics and spatiotemporally distributed stiffness, such as hydrogen detonation propagation, we achieve an overall speedup of 6.49× with substantial computational throughput. Finally, this AMR framework is applied to a large-scale three-dimensional direct numerical simulation. Compared to prior CPU computations on a uniform grid, it yields a substantial reduction in total degrees of freedom involved in the calculation without compromising accuracy.
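The abstract does not say which low-storage explicit Runge–Kutta scheme the authors use. As a rough illustration of the general idea only, the sketch below applies Williamson's classical three-stage, third-order 2N-storage scheme to a toy reversible reaction A <-> B. The scheme choice, the rate constants, and the names `lsrk3_step`, `toy_rhs`, and `integrate` are all assumptions made for this example and are not taken from the paper. The point it shows is that each stage overwrites the same two arrays (the state and one accumulator), which is the property that keeps per-thread storage small on a GPU.

```python
import math

# Williamson's three-stage, third-order 2N-storage coefficients.
# Assumption: chosen for illustration; the paper's scheme is not specified here.
A = (0.0, -5.0 / 9.0, -153.0 / 128.0)
B = (1.0 / 3.0, 15.0 / 16.0, 8.0 / 15.0)


def lsrk3_step(rhs, y, dt):
    """Advance y in place by one step of size dt.

    Only the state y and one accumulator q persist between stages.
    The list f returned by rhs is a temporary used here for clarity;
    a per-cell GPU kernel would evaluate it component by component.
    """
    q = [0.0] * len(y)
    for a, b in zip(A, B):
        f = rhs(y)
        for i in range(len(y)):
            q[i] = a * q[i] + dt * f[i]  # accumulator reuses its own storage
            y[i] += b * q[i]             # state is updated in place
    return y


def toy_rhs(y, kf=2.0, kr=1.0):
    """Hypothetical reversible reaction A <-> B with made-up rate constants."""
    rate = kf * y[0] - kr * y[1]
    return [-rate, rate]


def integrate(y0, dt, n_steps):
    """Take n_steps fixed-size steps starting from a copy of y0."""
    y = list(y0)
    for _ in range(n_steps):
        lsrk3_step(toy_rhs, y, dt)
    return y


if __name__ == "__main__":
    y = integrate([1.0, 0.0], dt=0.01, n_steps=100)
    # Closed-form solution of the toy problem at t = 1.
    exact_a = 1.0 / 3.0 + (2.0 / 3.0) * math.exp(-3.0)
    print("A(1) numerical:", y[0], "error:", abs(y[0] - exact_a))
```

An explicit scheme like this needs no Jacobian or linear solve, which is consistent with the abstract's claim that explicit integration can beat implicit methods when the mechanism is small and the stiffness is moderate; for strongly stiff kinetics the stable step size would shrink and that advantage would not be expected to hold.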
Source journal
Computer Physics Communications (Physics; Computer Science, Interdisciplinary Applications)
CiteScore: 12.10
Self-citation rate: 3.20%
Articles published: 287
Review time: 5.3 months
About the journal: The focus of CPC is on contemporary computational methods and techniques and their implementation, the effectiveness of which will normally be evidenced by the author(s) within the context of a substantive problem in physics. Within this setting CPC publishes two types of paper. Computer Programs in Physics (CPiP): These papers describe significant computer programs to be archived in the CPC Program Library, which is held in the Mendeley Data repository. The submitted software must be covered by an approved open source licence. Papers and associated computer programs that address a problem of contemporary interest in physics that cannot be solved by current software are particularly encouraged. Computational Physics Papers (CP): These are research papers in, but not limited to, the following themes across computational physics and related disciplines: mathematical and numerical methods and algorithms; computational models, including those associated with the design, control and analysis of experiments; and algebraic computation. Each will normally include software implementation and performance details. The software implementation should, ideally, be available via GitHub, Zenodo or an institutional repository. In addition, research papers on the impact of advanced computer architecture and special purpose computers on computing in the physical sciences, and on software topics related to, and of importance in, the physical sciences, may be considered.