{"title":"Construction of Arbitrary Order Finite Element Degree-of-Freedom Maps on Polygonal and Polyhedral Cell Meshes","authors":"Matthew W. Scroggs, Jørgen S. Dokken, C. Richardson, G. N. Wells","doi":"10.1145/3524456","DOIUrl":"https://doi.org/10.1145/3524456","url":null,"abstract":"We develop a method for generating degree-of-freedom maps for arbitrary order Ciarlet-type finite element spaces for any cell shape. The approach is based on the composition of permutations and transformations by cell sub-entity. Current approaches to generating degree-of-freedom maps for arbitrary order problems typically rely on a consistent orientation of cell entities that permits the definition of a common local coordinate system on shared edges and faces. However, while orientation of a mesh is straightforward for simplex cells and is a local operation, it is not a strictly local operation for quadrilateral cells and, in the case of hexahedral cells, not all meshes are orientable. The permutation and transformation approach is developed for a range of element types, including arbitrary degree Lagrange, serendipity, and divergence- and curl-conforming elements, and for a range of cell shapes. The approach is local and can be applied to cells of any shape, including general polytopes and meshes with mixed cell types. A number of examples are presented and the developed approach has been implemented in open-source libraries.","PeriodicalId":7036,"journal":{"name":"ACM Transactions on Mathematical Software (TOMS)","volume":"32 1","pages":"1 - 23"},"PeriodicalIF":0.0,"publicationDate":"2021-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87908668","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Algorithm 1020: Computation of Multi-Degree Tchebycheffian B-Splines","authors":"H. Speleers","doi":"10.1145/3478686","DOIUrl":"https://doi.org/10.1145/3478686","url":null,"abstract":"Multi-degree Tchebycheffian splines are splines with pieces drawn from extended (complete) Tchebycheff spaces, which may differ from interval to interval, and possibly of different dimensions. These are a natural extension of multi-degree polynomial splines. Under quite mild assumptions, they can be represented in terms of a so-called multi-degree Tchebycheffian B-spline (MDTB-spline) basis; such basis possesses all the characterizing properties of the classical polynomial B-spline basis. We present a practical framework to compute MDTB-splines, and provide an object-oriented implementation in Matlab. The implementation supports the construction, differentiation, and visualization of MDTB-splines whose pieces belong to Tchebycheff spaces that are null-spaces of constant-coefficient linear differential operators. The construction relies on an extraction operator that maps local Tchebycheffian Bernstein functions to the MDTB-spline basis of interest.","PeriodicalId":7036,"journal":{"name":"ACM Transactions on Mathematical Software (TOMS)","volume":"29 1","pages":"1 - 31"},"PeriodicalIF":0.0,"publicationDate":"2021-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78262661","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust level-3 BLAS Inverse Iteration from the Hessenberg Matrix","authors":"A. Schwarz","doi":"10.1145/3544789","DOIUrl":"https://doi.org/10.1145/3544789","url":null,"abstract":"Inverse iteration is known to be an effective method for computing eigenvectors corresponding to simple and well-separated eigenvalues. In the non-symmetric case, the solution of shifted Hessenberg systems is a central step. Existing inverse iteration solvers approach the solution of the shifted Hessenberg systems with either RQ or LU factorizations and, once factored, solve the corresponding systems. This approach has limited level-3 BLAS potential since distinct shifts have distinct factorizations. This paper rearranges the RQ approach such that data shared between distinct shifts can be exploited. Thereby the backward substitution with the triangular R factor can be expressed mostly with matrix–matrix multiplications (level-3 BLAS). The resulting algorithm computes eigenvectors in a tiled, overflow-free, and task-parallel fashion. The numerical experiments show that the new algorithm outperforms existing inverse iteration solvers for the computation of both real and complex eigenvectors.","PeriodicalId":7036,"journal":{"name":"ACM Transactions on Mathematical Software (TOMS)","volume":"121 1","pages":"1 - 30"},"PeriodicalIF":0.0,"publicationDate":"2021-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85350753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"H2Pack","authors":"Hua Huang, Xin Xing, Edmond Chow","doi":"10.1145/3412850","DOIUrl":"https://doi.org/10.1145/3412850","url":null,"abstract":"Dense kernel matrices represented in H2 matrix format typically require less storage and have faster matrix-vector multiplications than when these matrices are represented in the standard dense format. In this article, we present H2Pack, a high-performance, shared-memory library for constructing and operating with H2 matrix representations for kernel matrices defined by non-oscillatory, translationally invariant kernel functions. Using a hybrid analytic-algebraic compression method called the proxy point method, H2Pack can efficiently construct an H2 matrix representation with linear computational complexity. Storage and matrix-vector multiplication also have linear complexity. H2Pack also introduces the concept of “partially admissible blocks” for H2 matrices to make H2 matrix-vector multiplication mathematically identical to the fast multipole method (FMM) if analytic expansions are used. We optimize H2Pack from both the algorithm and software perspectives. Compared to existing FMM libraries, H2Pack generally has much faster H2 matrix-vector multiplications, since the proxy point method is more effective at producing block low-rank approximations than the analytic methods used in FMM. As a tradeoff, H2 matrix construction in H2Pack is typically more expensive than the setup cost in FMM libraries. Thus, H2Pack is ideal for applications that need a large number of matrix-vector multiplications for a given configuration of data points.","PeriodicalId":7036,"journal":{"name":"ACM Transactions on Mathematical Software (TOMS)","volume":"5 1","pages":"1 - 29"},"PeriodicalIF":0.0,"publicationDate":"2020-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74250654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Strengths and Limitations of Stretching for Least-squares Problems with Some Dense Rows","authors":"J. Scott, M. Tuma","doi":"10.1145/3412559","DOIUrl":"https://doi.org/10.1145/3412559","url":null,"abstract":"We recently introduced a sparse stretching strategy for handling dense rows that can arise in large-scale linear least-squares problems and make such problems challenging to solve. Sparse stretching is designed to limit the amount of fill within the stretched normal matrix and hence within the subsequent Cholesky factorization. While preliminary results demonstrated that sparse stretching performs significantly better than standard stretching, it has a number of limitations. In this article, we discuss and illustrate these limitations and propose new strategies that are designed to overcome them. Numerical experiments on problems arising from practical applications are used to demonstrate the effectiveness of these new ideas. We consider both direct and preconditioned iterative solvers.","PeriodicalId":7036,"journal":{"name":"ACM Transactions on Mathematical Software (TOMS)","volume":"95 1","pages":"1 - 25"},"PeriodicalIF":0.0,"publicationDate":"2020-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83355281","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Algorithm 1013","authors":"Daisy Arroyo, X. Emery","doi":"10.1145/3421316","DOIUrl":"https://doi.org/10.1145/3421316","url":null,"abstract":"A continuous spectral algorithm and computer routines in the R programming environment that enable the simulation of second-order stationary and intrinsic (i.e., with second-order stationary increments or generalized increments) vector Gaussian random fields in Euclidean spaces are presented. The simulation is obtained by computing a weighted sum of cosine and sine waves, with weights that depend on the matrix-valued spectral density associated with the spatial correlation structure of the random field to simulate. The computational cost is proportional to the number of locations targeted for simulation, below that of sequential, matrix decomposition and discrete spectral algorithms. Also, the implementation is versatile, as there is no restriction on the number of vector components, workspace dimension, number and geometrical configuration of the target locations. The computer routines are illustrated with synthetic examples and statistical testing is proposed to check the normality of the distribution of the simulated random field or of its generalized increments. A by-product of this work is a spectral representation of spherical, cubic, penta, Askey, J-Bessel, Cauchy, Laguerre, hypergeometric, iterated exponential, gamma, and stable covariance models in the d-dimensional Euclidean space.","PeriodicalId":7036,"journal":{"name":"ACM Transactions on Mathematical Software (TOMS)","volume":"44 1","pages":"1 - 25"},"PeriodicalIF":0.0,"publicationDate":"2020-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90171710","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Enhancement of the Bisection Method Average Performance Preserving Minmax Optimality","authors":"I. F. D. Oliveira, R. Takahashi","doi":"10.1145/3423597","DOIUrl":"https://doi.org/10.1145/3423597","url":null,"abstract":"We identify a class of root-searching methods that surprisingly outperform the bisection method on the average performance while retaining minmax optimality. The improvement on the average applies for any continuous distributional hypothesis. We also pinpoint one specific method within the class and show that under mild initial conditions it can attain an order of convergence of up to 1.618, i.e., the same as the secant method. Hence, we attain both an improved average performance and an improved order of convergence with no cost on the minmax optimality of the bisection method. Numerical experiments show that, on regular functions, the proposed method requires a number of function evaluations similar to current state-of-the-art methods, about 24% to 37% of the evaluations required by the bisection procedure. In problems with non-regular functions, the proposed method performs significantly better than the state-of-the-art, requiring on average 82% of the total evaluations required for the bisection method, while the other methods were outperformed by bisection. In the worst case, while current state-of-the-art commercial solvers required two to three times the number of function evaluations of bisection, our proposed method remained within the minmax bounds of the bisection method.","PeriodicalId":7036,"journal":{"name":"ACM Transactions on Mathematical Software (TOMS)","volume":"29 1","pages":"1 - 24"},"PeriodicalIF":0.0,"publicationDate":"2020-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79317303","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GetFEM","authors":"Y. Renard, K. Poulios","doi":"10.1145/3412849","DOIUrl":"https://doi.org/10.1145/3412849","url":null,"abstract":"This article presents the major mathematical and implementation features of a weak form language (GWFL) for an automated finite-element (FE) solution of partial differential equation systems. The language is implemented in the GetFEM framework and strategic modeling and software architecture choices both for the language and the framework are presented in detail. Moreover, conceptual similarities and differences to existing high-level FE frameworks are discussed. Special attention is given to the concept of a generic transformation mechanism that contributes to the high expressive power of GWFL, allowing multiple computational domains, or parts of the same domain, to be interconnected. Finally, the capabilities of the language for expressing strongly coupled multiphysics problems in a compact and readable form are shown by means of modeling examples.","PeriodicalId":7036,"journal":{"name":"ACM Transactions on Mathematical Software (TOMS)","volume":"12 1","pages":"1 - 31"},"PeriodicalIF":0.0,"publicationDate":"2020-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88848730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Algorithm 1014","authors":"C. F. Borges","doi":"10.1145/3428446","DOIUrl":"https://doi.org/10.1145/3428446","url":null,"abstract":"We develop fast and accurate algorithms for evaluating √(x² + y²) for two floating-point numbers x and y. Library functions that perform this computation are generally named hypot(x,y). We compare five approaches that we will develop in this article to the current resident library function that is delivered with Julia 1.1 and to the code that has been distributed with the C math library for decades. We will investigate the accuracy of our algorithms by simulation.","PeriodicalId":7036,"journal":{"name":"ACM Transactions on Mathematical Software (TOMS)","volume":"111 1","pages":"1 - 12"},"PeriodicalIF":0.0,"publicationDate":"2020-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81469495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enabling New Flexibility in the SUNDIALS Suite of Nonlinear and Differential/Algebraic Equation Solvers","authors":"D. J. Gardner, D. Reynolds, C. Woodward, C. Balos","doi":"10.1145/3539801","DOIUrl":"https://doi.org/10.1145/3539801","url":null,"abstract":"In recent years, the SUite of Nonlinear and DIfferential/ALgebraic equation Solvers (SUNDIALS) has been redesigned to better enable the use of application-specific and third-party algebraic solvers and data structures. Throughout this work, we have adhered to specific guiding principles that minimized the impact to current users while providing maximum flexibility for later evolution of solvers and data structures. The redesign was done through the addition of new linear and nonlinear solvers classes, enhancements to the vector class, and the creation of modern Fortran interfaces. The vast majority of this work has been performed “behind-the-scenes,” with minimal changes to the user interface and no reduction in solver capabilities or performance. These changes allow SUNDIALS users to more easily utilize external solver libraries and create highly customized solvers, enabling greater flexibility on extreme-scale, heterogeneous computational architectures.","PeriodicalId":7036,"journal":{"name":"ACM Transactions on Mathematical Software (TOMS)","volume":"1 1","pages":"1 - 24"},"PeriodicalIF":0.0,"publicationDate":"2020-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90277894","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}