CG-Kit: Code Generation Toolkit for performant and maintainable variants of source code applied to Flash-X hydrodynamics simulations
Journal: Future Generation Computer Systems-The International Journal of Escience (impact factor 6.2; JCR Q1, Computer Science, Theory & Methods)
Publication date: 2024-09-12
DOI: 10.1016/j.future.2024.107511
URL: https://www.sciencedirect.com/science/article/pii/S0167739X24004758
Citations: 0
Abstract
CG-Kit is a new Code Generation tool-Kit that we have developed as a part of the solution for portability and maintainability for multiphysics computing applications. The development of CG-Kit is rooted in the urgent need created by the shifting landscape of high-performance computing platforms and the algorithmic complexities of a particular large-scale multiphysics application: Flash-X. To efficiently use computing resources on a heterogeneous node, an application must have a map of computation to resources and a mechanism to move the data and computation to the resources according to the map. Most existing performance portability solutions are focussed on abstracting the expression of computations so that a unified source code can be specialized to run on different resources. However, such an approach is insufficient for a code like Flash-X, which has a multitude of code components that can be assembled in various permutations and combinations to form different instances of applications. Similar challenges apply to any code that has composability, where a single specified way of apportioning work among devices may not be optimal. Additionally, use cases arise where the optimal control flow of computation may differ for different devices while the underlying numerics remain identical. This combination leads to unique challenges including handling an existing large code base in Fortran and/or C/C++, subdivision of code into a great variety of units supporting a wide range of physics and numerical methods, different parallelization techniques for distributed and shared memory systems and accelerator devices, and heterogeneity of computing platforms requiring coexisting variants of parallel algorithms. All of these challenges demand that scientific software developers apply existing knowledge about domain applications, algorithms, and computing platforms to determine custom abstractions and granularity for code generation. 
There is a critical lack of tools to tackle those problems. CG-Kit is designed to fill this gap by providing a user with the ability to express their desired control flow and computation-to-resource map in the form of a pseudocode-like recipe. It consists of standalone tools that can be combined into highly specific and, we argue, highly effective portability and maintainability toolchains. Here we present the design of our new tools: parametrized source trees, control flow graphs, and recipes. The tools are implemented in Python. They are agnostic to the programming language of the source code targeted for code generation. We demonstrate the capabilities of the toolkit with two examples: first, multithreaded variants of the basic AXPY operation, and second, variants of parallel algorithms within Spark, a hydrodynamics solver from Flash-X that operates on block-structured adaptive meshes.
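To make the idea of generating coexisting variants from a single source more concrete, the following is a minimal sketch in Python. It is not the CG-Kit API: the names `AXPY_ROOT`, `VARIANTS`, and `generate` are hypothetical, and the standard-library `string.Template` stands in for a parametrized source tree. It only illustrates how one numerical kernel (AXPY, here emitted as C) can be kept in one place while the parallelization directive is supplied per variant.

```python
"""Toy illustration of generating source-code variants from one template.

All names here are hypothetical; this is not the CG-Kit API.
"""
from string import Template

# A single "root" template with one named slot. The generator treats the
# target language (C in this sketch) as opaque text.
AXPY_ROOT = Template(
    "void axpy(int n, double a, const double *x, double *y) {\n"
    "${loop_pragma}"
    "    for (int i = 0; i < n; ++i) {\n"
    "        y[i] = a * x[i] + y[i];\n"
    "    }\n"
    "}\n"
)

# Each variant fills the same slot differently; the numerics stay identical.
VARIANTS = {
    "serial": {"loop_pragma": ""},
    "openmp_cpu": {"loop_pragma": "    #pragma omp parallel for\n"},
    "openmp_gpu": {
        "loop_pragma": (
            "    #pragma omp target teams distribute parallel for "
            "map(to: x[0:n]) map(tofrom: y[0:n])\n"
        )
    },
}


def generate(variant: str) -> str:
    """Return the C source text for the requested variant."""
    return AXPY_ROOT.substitute(VARIANTS[variant])


if __name__ == "__main__":
    # Print the multithreaded CPU variant as an example.
    print(generate("openmp_cpu"))
```

In this sketch the loop body appears exactly once, so a change to the numerics propagates to every variant, while adding a new target only requires a new entry in `VARIANTS`. A real toolchain would need nested templates and a recipe describing control flow, which this example does not attempt.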
About the journal:
Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications.
Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration.
Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.