gpu原生自适应网格细化及其在晶格玻尔兹曼模拟中的应用

IF 7.2 2区 物理与天体物理 Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
Khodr Jaber , Ebenezer E. Essel , Pierre E. Sullivan
{"title":"gpu原生自适应网格细化及其在晶格玻尔兹曼模拟中的应用","authors":"Khodr Jaber ,&nbsp;Ebenezer E. Essel ,&nbsp;Pierre E. Sullivan","doi":"10.1016/j.cpc.2025.109543","DOIUrl":null,"url":null,"abstract":"<div><div>Adaptive Mesh Refinement (AMR) enables efficient computation of flows by providing high resolution in critical regions while allowing for coarsening in areas where fine detail is unnecessary. While early AMR software packages relied solely on CPU parallelization, the widespread adoption of heterogeneous computing systems has led to GPU-accelerated implementations. In these hybrid approaches, simulation data typically resides on the GPU, and mesh management and adaptation occur exclusively on the CPU, necessitating frequent data transfers between them. A more efficient strategy is to adapt and maintain the entire mesh structure exclusively on the GPU, eliminating these transfers. Because of its inherent parallelism, the Lattice Boltzmann Method (LBM) has been widely implemented in hybrid AMR frameworks. This work presents a GPU-native algorithm for AMR using a block-based forest of octrees approach, implemented in both two and three dimensions as open-source C++/CUDA code. The implementation includes a Lattice Boltzmann solver for weakly compressible flow, though the underlying grid refinement procedure is compatible with any solver operating on cell-centered block-based grids. The lid-driven cavity and flow past a square cylinder benchmarks validate the algorithm's effectiveness across multiple velocity sets in both single- and double-precision. Tests conducted on consumer and datacenter-grade GPUs demonstrate its versatility across different hardware platforms.</div><div>Link to repository: <span><span>https://github.com/KhodrJ/AGAL</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":285,"journal":{"name":"Computer Physics Communications","volume":"311 ","pages":"Article 109543"},"PeriodicalIF":7.2000,"publicationDate":"2025-02-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"GPU-native adaptive mesh refinement with application to lattice Boltzmann simulations\",\"authors\":\"Khodr Jaber ,&nbsp;Ebenezer E. Essel ,&nbsp;Pierre E. Sullivan\",\"doi\":\"10.1016/j.cpc.2025.109543\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Adaptive Mesh Refinement (AMR) enables efficient computation of flows by providing high resolution in critical regions while allowing for coarsening in areas where fine detail is unnecessary. While early AMR software packages relied solely on CPU parallelization, the widespread adoption of heterogeneous computing systems has led to GPU-accelerated implementations. In these hybrid approaches, simulation data typically resides on the GPU, and mesh management and adaptation occur exclusively on the CPU, necessitating frequent data transfers between them. A more efficient strategy is to adapt and maintain the entire mesh structure exclusively on the GPU, eliminating these transfers. Because of its inherent parallelism, the Lattice Boltzmann Method (LBM) has been widely implemented in hybrid AMR frameworks. This work presents a GPU-native algorithm for AMR using a block-based forest of octrees approach, implemented in both two and three dimensions as open-source C++/CUDA code. The implementation includes a Lattice Boltzmann solver for weakly compressible flow, though the underlying grid refinement procedure is compatible with any solver operating on cell-centered block-based grids. The lid-driven cavity and flow past a square cylinder benchmarks validate the algorithm's effectiveness across multiple velocity sets in both single- and double-precision. Tests conducted on consumer and datacenter-grade GPUs demonstrate its versatility across different hardware platforms.</div><div>Link to repository: <span><span>https://github.com/KhodrJ/AGAL</span><svg><path></path></svg></span>.</div></div>\",\"PeriodicalId\":285,\"journal\":{\"name\":\"Computer Physics Communications\",\"volume\":\"311 \",\"pages\":\"Article 109543\"},\"PeriodicalIF\":7.2000,\"publicationDate\":\"2025-02-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computer Physics Communications\",\"FirstCategoryId\":\"101\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0010465525000463\",\"RegionNum\":2,\"RegionCategory\":\"物理与天体物理\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer Physics Communications","FirstCategoryId":"101","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0010465525000463","RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
引用次数: 0

摘要

自适应网格细化(AMR)通过在关键区域提供高分辨率,同时允许在不需要精细细节的区域进行粗化,从而实现高效的流量计算。虽然早期的AMR软件包仅依赖于CPU并行化,但异构计算系统的广泛采用导致了gpu加速实现。在这些混合方法中,模拟数据通常驻留在GPU上,网格管理和适应只发生在CPU上,需要在它们之间频繁传输数据。更有效的策略是在GPU上调整和维护整个网格结构,消除这些传输。由于其固有的并行性,晶格玻尔兹曼方法在混合AMR框架中得到了广泛的应用。这项工作提出了一种gpu原生的AMR算法,使用基于块的八叉树森林方法,在二维和三维上作为开源的c++ /CUDA代码实现。该实现包括一个用于弱可压缩流的晶格玻尔兹曼求解器,尽管底层网格细化过程与任何在以细胞为中心的基于块的网格上操作的求解器兼容。盖驱动的空腔和流过方形圆柱体的基准测试验证了该算法在单精度和双精度下跨越多个速度集的有效性。在消费者级和数据中心级gpu上进行的测试证明了它在不同硬件平台上的通用性。链接到存储库:https://github.com/KhodrJ/AGAL。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
GPU-native adaptive mesh refinement with application to lattice Boltzmann simulations
Adaptive Mesh Refinement (AMR) enables efficient computation of flows by providing high resolution in critical regions while allowing for coarsening in areas where fine detail is unnecessary. While early AMR software packages relied solely on CPU parallelization, the widespread adoption of heterogeneous computing systems has led to GPU-accelerated implementations. In these hybrid approaches, simulation data typically resides on the GPU, and mesh management and adaptation occur exclusively on the CPU, necessitating frequent data transfers between them. A more efficient strategy is to adapt and maintain the entire mesh structure exclusively on the GPU, eliminating these transfers. Because of its inherent parallelism, the Lattice Boltzmann Method (LBM) has been widely implemented in hybrid AMR frameworks. This work presents a GPU-native algorithm for AMR using a block-based forest of octrees approach, implemented in both two and three dimensions as open-source C++/CUDA code. The implementation includes a Lattice Boltzmann solver for weakly compressible flow, though the underlying grid refinement procedure is compatible with any solver operating on cell-centered block-based grids. The lid-driven cavity and flow past a square cylinder benchmarks validate the algorithm's effectiveness across multiple velocity sets in both single- and double-precision. Tests conducted on consumer and datacenter-grade GPUs demonstrate its versatility across different hardware platforms.
Link to repository: https://github.com/KhodrJ/AGAL.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Computer Physics Communications
Computer Physics Communications 物理-计算机:跨学科应用
CiteScore
12.10
自引率
3.20%
发文量
287
审稿时长
5.3 months
期刊介绍: The focus of CPC is on contemporary computational methods and techniques and their implementation, the effectiveness of which will normally be evidenced by the author(s) within the context of a substantive problem in physics. Within this setting CPC publishes two types of paper. Computer Programs in Physics (CPiP) These papers describe significant computer programs to be archived in the CPC Program Library which is held in the Mendeley Data repository. The submitted software must be covered by an approved open source licence. Papers and associated computer programs that address a problem of contemporary interest in physics that cannot be solved by current software are particularly encouraged. Computational Physics Papers (CP) These are research papers in, but are not limited to, the following themes across computational physics and related disciplines. mathematical and numerical methods and algorithms; computational models including those associated with the design, control and analysis of experiments; and algebraic computation. Each will normally include software implementation and performance details. The software implementation should, ideally, be available via GitHub, Zenodo or an institutional repository.In addition, research papers on the impact of advanced computer architecture and special purpose computers on computing in the physical sciences and software topics related to, and of importance in, the physical sciences may be considered.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信