Stencils in Scientific Computations

Proceedings of the Second Workshop on Optimizing Stencil Computations. Pub Date: 2014-10-20. DOI: 10.1145/2686745.2686756

A. Dubey


Citations: 5

Abstract

Stencils occur in many areas, and they are ubiquitous in scientific computing. They range from simple Jacobi iterations to the extremely complex stencils used in solving highly nonlinear partial differential equations (PDEs). The high-level programming languages typically used to implement scientific software provide no explicit support for stencils, so each implementation is forced to make choices about specifics such as dimensionality, data layout, order of access, and order of operations. These choices often hide optimization opportunities from compilers. There have been attempts to provide abstractions for simpler stencils, and they have met with success in some areas, but multiphysics scientific applications present challenges that simple stencil abstractions cannot meet. The applications may have hierarchy, non-uniformity, or both in their discretizations, which cannot be expressed by stencils that describe uniform discretizations. The physics operators being applied may be nonlinear, which demands composability of stencils. As the order of the solution method increases, the size and reach of the stencil also increase, and there may be conditions that require applying the stencil to an arbitrary subset of the discretized points. Finally, if an update involves multiple steps, intermediate results need to be managed. AMR Shift Calculus (Phil Colella and Brian Van Straalen, 2014) provides a generalized abstraction that addresses many of these concerns. It provides a means of expressing stencil computations as a collection of shift operations combined with associated weights, which can be applied to a specified collection of discretized points. The shift calculus also addresses hierarchy in the discretization, and defines operators on stencils that allow more complex stencils to be composed from simpler ones.
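The idea of a stencil as a collection of shifts with associated weights, applied to a specified set of points, can be illustrated with a minimal Python sketch. This is not the AMR Shift Calculus API: the names `apply_stencil` and `LAPLACIAN_2D`, and the dictionary-based representation of fields and stencils, are assumptions made purely for illustration.

```python
from typing import Dict, Iterable, Tuple

Point = Tuple[int, ...]
Stencil = Dict[Point, float]  # hypothetical representation: shift -> weight


def apply_stencil(stencil: Stencil,
                  field: Dict[Point, float],
                  points: Iterable[Point]) -> Dict[Point, float]:
    """For each point p, return the sum over shifts s of weight(s) * field[p + s]."""
    result = {}
    for p in points:
        total = 0.0
        for shift, weight in stencil.items():
            q = tuple(pi + si for pi, si in zip(p, shift))
            total += weight * field[q]
        result[p] = total
    return result


# Standard 5-point Laplacian on a unit-spaced 2D grid, written as shifts and weights.
LAPLACIAN_2D: Stencil = {
    (0, 0): -4.0,
    (1, 0): 1.0, (-1, 0): 1.0,
    (0, 1): 1.0, (0, -1): 1.0,
}

# Sample field f(i, j) = i^2 + j^2 on a 5 x 5 grid.
field = {(i, j): float(i * i + j * j) for i in range(5) for j in range(5)}

# The stencil is applied only to the points we name: here the interior,
# but any subset whose shifted neighbours exist in the field would do.
interior = [(i, j) for i in range(1, 4) for j in range(1, 4)]
lap = apply_stencil(LAPLACIAN_2D, field, interior)
```

Because the stencil is plain data (shifts and weights) and the target points are passed in separately, nothing in this description fixes a loop order, a data layout, or the order in which the weighted terms are summed. The loops in `apply_stencil` are only one possible realization.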
Because the shift calculus makes it possible to express the computation concisely and precisely, it gets around the problem of false dependencies. Additionally, the composability of the stencil operators exposes to the compiler the possibilities of loop or even function fusion, as well as the granularity at which intermediate values are held, which creates better optimization opportunities. The included slide presentation is organized in five sections. The first section gives examples of discretizations, from the simple Poisson equation to the complex compressible Navier-Stokes (CNS) equations, and addresses the level of abstraction needed to express computations on these discretizations. The second section outlines several challenges that are unique to scientific applications, and the ways in which many abstractions that have proved useful elsewhere fail to work for scientific computing. The third section describes the AMR shift calculus, with emphasis on features that are typically not found in other stencil-based abstractions but are necessary for solving complex PDEs in scientific computing. The fourth section provides an example of applying one aspect of the shift calculus to a CNS solver implemented in F90. The example replaces loop nests that explicitly implement the first and second derivative operators applied to various field variables in the solver. It not only reduces the line count of the code, but also removes unnecessary specificity about the order of floating-point operations in the implementation. Finally, the fifth section discusses the optimization opportunities that become available to compilers when the computations are expressed in a high-level abstraction.
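The link between composability and fusion can be sketched for the simplest case, linear stencils in one dimension. The `compose` function below is a hypothetical helper, not an operator taken from the shift calculus itself. It assumes that applying one linear stencil after another is equivalent to a single stencil whose shifts are sums of the original shifts and whose weights are the corresponding products.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Point = Tuple[int, ...]
Stencil = Dict[Point, float]  # hypothetical representation: shift -> weight


def apply_stencil(stencil: Stencil,
                  field: Dict[Point, float],
                  points: Iterable[Point]) -> Dict[Point, float]:
    """For each point p, return the sum over shifts s of weight(s) * field[p + s]."""
    return {
        p: sum(w * field[tuple(pi + si for pi, si in zip(p, s))]
               for s, w in stencil.items())
        for p in points
    }


def compose(outer: Stencil, inner: Stencil) -> Stencil:
    """Single stencil equivalent to applying `inner` first, then `outer` (linear case only)."""
    combined = defaultdict(float)
    for s_out, w_out in outer.items():
        for s_in, w_in in inner.items():
            shift = tuple(a + b for a, b in zip(s_out, s_in))
            combined[shift] += w_out * w_in
    # Drop shifts whose weights cancelled to zero.
    return {s: w for s, w in combined.items() if w != 0.0}


# First-order forward and backward differences on a unit-spaced 1D grid.
D_PLUS: Stencil = {(1,): 1.0, (0,): -1.0}
D_MINUS: Stencil = {(0,): 1.0, (-1,): -1.0}

# Composing them yields the familiar three-point second difference.
D2 = compose(D_PLUS, D_MINUS)

field = {(i,): float(i ** 3) for i in range(-1, 8)}
points = [(i,) for i in range(1, 6)]

# Two passes: an intermediate field must be stored on a slightly larger set of points.
tmp = apply_stencil(D_MINUS, field, [(i,) for i in range(1, 7)])
two_pass = apply_stencil(D_PLUS, tmp, points)

# One fused pass: no intermediate storage is needed.
fused = apply_stencil(D2, field, points)
```

In this sketch the two-pass and fused versions sum their terms in different orders. They agree exactly here only because the sample values are small integers; with general floating-point data the results could differ in the last bits. That is the kind of ordering decision a high-level description leaves open rather than fixing in the source code.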
