CG-Kit: Code Generation Toolkit for performant and maintainable variants of source code applied to Flash-X hydrodynamics simulations

Impact factor: 6.2; Zone 2 (Computer Science); JCR Q1 (COMPUTER SCIENCE, THEORY & METHODS)
DOI: 10.1016/j.future.2024.107511
Journal: Future Generation Computer Systems-The International Journal of Escience
Publication date: 2024-09-12
Publication type: Journal Article
Open access: no
URL: https://www.sciencedirect.com/science/article/pii/S0167739X24004758
Citations: 0

Abstract

CG-Kit is a new Code Generation tool-Kit that we have developed as a part of the solution for portability and maintainability for multiphysics computing applications. The development of CG-Kit is rooted in the urgent need created by the shifting landscape of high-performance computing platforms and the algorithmic complexities of a particular large-scale multiphysics application: Flash-X. To efficiently use computing resources on a heterogeneous node, an application must have a map of computation to resources and a mechanism to move the data and computation to the resources according to the map. Most existing performance portability solutions are focussed on abstracting the expression of computations so that a unified source code can be specialized to run on different resources. However, such an approach is insufficient for a code like Flash-X, which has a multitude of code components that can be assembled in various permutations and combinations to form different instances of applications. Similar challenges apply to any code that has composability, where a single specified way of apportioning work among devices may not be optimal. Additionally, use cases arise where the optimal control flow of computation may differ for different devices while the underlying numerics remain identical. This combination leads to unique challenges including handling an existing large code base in Fortran and/or C/C++, subdivision of code into a great variety of units supporting a wide range of physics and numerical methods, different parallelization techniques for distributed and shared memory systems and accelerator devices, and heterogeneity of computing platforms requiring coexisting variants of parallel algorithms. All of these challenges demand that scientific software developers apply existing knowledge about domain applications, algorithms, and computing platforms to determine custom abstractions and granularity for code generation. 
There is a critical lack of tools to tackle these problems. CG-Kit is designed to fill this gap by giving users the ability to express their desired control flow and computation-to-resource map in the form of a pseudocode-like recipe. It consists of standalone tools that can be combined into highly specific and, we argue, highly effective portability and maintainability toolchains. Here we present the design of our new tools: parametrized source trees, control flow graphs, and recipes. The tools are implemented in Python and are agnostic to the programming language of the source code targeted for code generation. We demonstrate the capabilities of the toolkit with two examples: first, multithreaded variants of the basic AXPY operation; and second, variants of parallel algorithms within Spark, a hydrodynamics solver from Flash-X that operates on block-structured adaptive meshes.
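
To make the idea of generating code variants concrete, the following is a minimal sketch and not CG-Kit's actual API or recipe syntax, neither of which is described in the abstract. It uses only Python's standard `string.Template` to produce a serial and an OpenMP-threaded variant of an AXPY kernel in C from one shared template. The names `AXPY_TEMPLATE`, `VARIANTS`, and `generate_axpy` are hypothetical and introduced purely for illustration.

```python
from string import Template

# Hypothetical template for an AXPY kernel (y <- a*x + y) written in C.
# The $pragma placeholder is the only point where the variants differ,
# so the numerics are shared by every generated variant.
AXPY_TEMPLATE = Template(
    "void axpy(int n, double a, const double *x, double *y) {\n"
    "$pragma"
    "    for (int i = 0; i < n; ++i) {\n"
    "        y[i] = a * x[i] + y[i];\n"
    "    }\n"
    "}\n"
)

# Hypothetical map from a variant name to the text spliced into the template.
VARIANTS = {
    "serial": "",
    "openmp": "    #pragma omp parallel for\n",
}


def generate_axpy(variant: str) -> str:
    """Return C source text for the requested AXPY variant."""
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant: {variant!r}")
    return AXPY_TEMPLATE.substitute(pragma=VARIANTS[variant])


if __name__ == "__main__":
    # Print every variant so the generated sources can be compared.
    for name in VARIANTS:
        print(f"// variant: {name}")
        print(generate_axpy(name))
```

The sketch only illustrates why generation can help maintainability: the loop body is written once, and each variant changes only the parallelization directive. The tools named in the abstract (parametrized source trees, control flow graphs, and recipes) are considerably more general than this single-placeholder substitution.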

Source journal
CiteScore: 19.90
Self-citation rate: 2.70%
Articles published: 376
Review time: 10.6 months
About the journal: Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications. Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration. Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.