Scaled ILU smoothers for Navier–Stokes pressure projection

Impact factor 1.7 · Zone 4, Engineering and Technology · JCR Q3, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
Stephen Thomas, Arielle Carr, Paul Mullowney, Katarzyna Świrydowicz, Marcus Day
International Journal for Numerical Methods in Fluids, 96(4), pp. 537-560. Published 2023-12-28. DOI: 10.1002/fld.5254
https://onlinelibrary.wiley.com/doi/10.1002/fld.5254
Citations: 0

Abstract


Scaled ILU smoothers for Navier–Stokes pressure projection

Incomplete LU (ILU) smoothers are effective in the algebraic multigrid (AMG) $V$-cycle for reducing high-frequency components of the error. However, the requisite direct triangular solves are comparatively slow on GPUs. Previous work has demonstrated the advantages of Jacobi iteration as an alternative to direct solution of these systems. Depending on the threshold and fill-level parameters chosen, the factors can be highly nonnormal and Jacobi is unlikely to converge in a low number of iterations. We demonstrate that row scaling can reduce the departure from normality, allowing us to replace the inherently sequential solve with a rapidly converging Richardson iteration. There are several advantages beyond the lower compute time. Scaling is performed locally for a diagonal block of the global matrix because it is applied directly to the factor. Further, an ILUT Schur complement smoother maintains a constant GMRES iteration count as the number of MPI ranks increases, and thus parallel strong-scaling is improved. Our algorithms have been incorporated into hypre, and we demonstrate improved time to solution for linear systems arising in the Nalu-Wind and PeleLM pressure solvers. For large problem sizes, GMRES+AMG executes at least five times faster when using iterative triangular solves compared with direct solves on massively parallel GPUs.
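
To make the idea of an iterative triangular solve concrete, here is a minimal pure-Python sketch. It is not the authors' hypre implementation, and all function names and the 4x4 test matrix are hypothetical. It assumes a dense lower-triangular factor L with nonzero diagonal, scales each row by its diagonal entry so the scaled factor is I + N with N strictly lower triangular, and then applies Richardson iteration x_{k+1} = x_k + (b_s - L_s x_k). Each sweep is a matrix-vector product, which parallelizes well, whereas forward substitution must proceed row by row. Because N is nilpotent, the iteration is exact after at most n sweeps; the abstract's point is that with suitable scaling far fewer sweeps suffice in practice.

```python
def forward_substitution(L, b):
    """Direct, inherently sequential solve of L x = b (L lower triangular)."""
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        s = sum(L[i][j] * x[j] for j in range(i))
        x[i] = (b[i] - s) / L[i][i]
    return x


def row_scale(L, b):
    """Divide every row of L and b by the diagonal entry (unit diagonal result)."""
    n = len(b)
    Ls = [[L[i][j] / L[i][i] for j in range(n)] for i in range(n)]
    bs = [b[i] / L[i][i] for i in range(n)]
    return Ls, bs


def richardson_triangular(Ls, bs, sweeps):
    """Richardson iteration x <- x + (bs - Ls x), starting from x = 0."""
    n = len(bs)
    x = [0.0] * n
    for _ in range(sweeps):
        # Residual of the scaled system; every entry can be computed independently.
        r = [bs[i] - sum(Ls[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [x[i] + r[i] for i in range(n)]
    return x


def max_error(x, y):
    """Largest absolute componentwise difference between two vectors."""
    return max(abs(a - c) for a, c in zip(x, y))


# Hypothetical small factor with weak off-diagonal coupling.
L = [
    [2.0, 0.0, 0.0, 0.0],
    [0.2, 4.0, 0.0, 0.0],
    [0.0, 0.4, 5.0, 0.0],
    [0.1, 0.0, 0.5, 2.0],
]
b = [2.0, 4.2, 5.4, 2.6]

x_direct = forward_substitution(L, b)
Ls, bs = row_scale(L, b)
# Error of the iterative solve against the direct solve after 1..4 sweeps.
errors = [max_error(richardson_triangular(Ls, bs, k), x_direct) for k in range(1, 5)]
```

In this toy case the error shrinks with every sweep and vanishes (to rounding) at the fourth, since the matrix is 4x4. How quickly the error drops before that bound depends on the size and nonnormality of N, which is what the row scaling described in the abstract is meant to improve.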

Source journal: International Journal for Numerical Methods in Fluids
Subject category: Physics; Computer Science, Interdisciplinary Applications
CiteScore: 3.70
Self-citation rate: 5.60%
Articles published: 111
Review time: 8 months
About the journal: The International Journal for Numerical Methods in Fluids publishes refereed papers describing significant developments in computational methods that are applicable to scientific and engineering problems in fluid mechanics, fluid dynamics, micro and bio fluidics, and fluid-structure interaction. Numerical methods for solving ancillary equations, such as transport and advection and diffusion, are also relevant. The Editors encourage contributions in the areas of multi-physics, multi-disciplinary and multi-scale problems involving fluid subsystems, verification and validation, uncertainty quantification, and model reduction. Numerical examples that illustrate the described methods or their accuracy are in general expected. Discussions of papers already in print are also considered. However, papers dealing strictly with applications of existing methods or dealing with areas of research that are not deemed to be cutting edge by the Editors will not be considered for review. The journal publishes full-length papers, which should normally be less than 25 journal pages in length. Two-part papers are discouraged unless considered necessary by the Editors.