GPU-native adaptive mesh refinement with application to lattice Boltzmann simulations
Khodr Jaber, Ebenezer E. Essel, Pierre E. Sullivan
Computer Physics Communications, volume 311, Article 109543. Published 2025-02-12.
DOI: 10.1016/j.cpc.2025.109543
https://www.sciencedirect.com/science/article/pii/S0010465525000463
Citations: 0
Abstract
Adaptive Mesh Refinement (AMR) enables efficient computation of flows by providing high resolution in critical regions while allowing for coarsening in areas where fine detail is unnecessary. While early AMR software packages relied solely on CPU parallelization, the widespread adoption of heterogeneous computing systems has led to GPU-accelerated implementations. In these hybrid approaches, simulation data typically resides on the GPU, and mesh management and adaptation occur exclusively on the CPU, necessitating frequent data transfers between them. A more efficient strategy is to adapt and maintain the entire mesh structure exclusively on the GPU, eliminating these transfers. Because of its inherent parallelism, the Lattice Boltzmann Method (LBM) has been widely implemented in hybrid AMR frameworks. This work presents a GPU-native algorithm for AMR using a block-based forest of octrees approach, implemented in both two and three dimensions as open-source C++/CUDA code. The implementation includes a Lattice Boltzmann solver for weakly compressible flow, though the underlying grid refinement procedure is compatible with any solver operating on cell-centered block-based grids. The lid-driven cavity and flow past a square cylinder benchmarks validate the algorithm's effectiveness across multiple velocity sets in both single- and double-precision. Tests conducted on consumer and datacenter-grade GPUs demonstrate its versatility across different hardware platforms.
Link to repository: https://github.com/KhodrJ/AGAL.
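To make the "block-based forest of trees" idea in the abstract concrete, here is a minimal host-side C++ sketch in two dimensions (a quadtree rather than an octree, for brevity). It is not taken from the AGAL code: the names `BlockForest`, `add_block`, `refine`, the 4x4 block size, and the piecewise-constant fill of new children are all assumptions made for illustration. The flat, index-based arrays are chosen because that kind of layout can be stored in GPU buffers without pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical layout (not AGAL's): every block is a fixed 4x4 patch of cells.
constexpr int kBlockDim = 4;
constexpr int kCellsPerBlock = kBlockDim * kBlockDim;
constexpr std::int32_t kNone = -1;

struct BlockForest {
    // Structure-of-arrays storage: one entry per block, linked by indices.
    std::vector<std::int32_t> parent;       // parent block id, kNone for roots
    std::vector<std::int32_t> first_child;  // id of first of 4 children, kNone for leaves
    std::vector<std::int32_t> level;        // refinement level, 0 for roots
    std::vector<double> cell_data;          // kCellsPerBlock cell-centered values per block

    // Append a new leaf block and return its id.
    std::int32_t add_block(std::int32_t parent_id, std::int32_t lvl) {
        const auto id = static_cast<std::int32_t>(parent.size());
        parent.push_back(parent_id);
        first_child.push_back(kNone);
        level.push_back(lvl);
        cell_data.resize(cell_data.size() + kCellsPerBlock, 0.0);
        return id;
    }

    std::size_t num_blocks() const { return parent.size(); }

    bool is_leaf(std::int32_t id) const { return first_child[id] == kNone; }

    std::size_t num_leaves() const {
        std::size_t n = 0;
        for (std::int32_t c : first_child) n += (c == kNone) ? 1 : 0;
        return n;
    }

    // Refine one leaf: append four contiguous children, then fill each child
    // cell with the value of the parent cell that covers it.
    void refine(std::int32_t id) {
        assert(id >= 0 && static_cast<std::size_t>(id) < num_blocks());
        if (!is_leaf(id)) return;  // already refined, nothing to do
        const std::int32_t child_level = level[id] + 1;
        const std::int32_t first = add_block(id, child_level);
        for (int c = 1; c < 4; ++c) add_block(id, child_level);
        first_child[id] = first;
        for (int c = 0; c < 4; ++c) {
            const int ox = (c % 2) * (kBlockDim / 2);  // quadrant offset in parent cells
            const int oy = (c / 2) * (kBlockDim / 2);
            for (int j = 0; j < kBlockDim; ++j) {
                for (int i = 0; i < kBlockDim; ++i) {
                    const int pi = ox + i / 2;  // two child cells per parent cell per axis
                    const int pj = oy + j / 2;
                    cell_data[(first + c) * kCellsPerBlock + j * kBlockDim + i] =
                        cell_data[id * kCellsPerBlock + pj * kBlockDim + pi];
                }
            }
        }
    }
};
```

Refining a single root block with `refine(0)` leaves five blocks in the forest, four of them leaves at level 1. In a GPU-native setting the same arrays would live in device memory and many marked blocks would be refined in parallel; how the paper's algorithm does this is not described in the abstract, so that part is left out of the sketch.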
About the journal:
The focus of CPC is on contemporary computational methods and techniques and their implementation, the effectiveness of which will normally be evidenced by the author(s) within the context of a substantive problem in physics. Within this setting CPC publishes two types of paper.
Computer Programs in Physics (CPiP)
These papers describe significant computer programs to be archived in the CPC Program Library which is held in the Mendeley Data repository. The submitted software must be covered by an approved open source licence. Papers and associated computer programs that address a problem of contemporary interest in physics that cannot be solved by current software are particularly encouraged.
Computational Physics Papers (CP)
These are research papers on themes across computational physics and related disciplines, including but not limited to the following:
mathematical and numerical methods and algorithms;
computational models including those associated with the design, control and analysis of experiments; and
algebraic computation.
Each will normally include software implementation and performance details. The software implementation should, ideally, be available via GitHub, Zenodo or an institutional repository. In addition, research papers on the impact of advanced computer architecture and special purpose computers on computing in the physical sciences, and on software topics related to and of importance in the physical sciences, may be considered.