ACM Transactions on Mathematical Software (TOMS): Latest Articles

Configurable Open-source Data Structure for Distributed Conforming Unstructured Homogeneous Meshes with GPU Support
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2022-09-10 DOI: 10.1145/3536164
Jakub Klinkovský, T. Oberhuber, R. Fučík, Vítezslav Zabka
Abstract: A general multi-purpose data structure for an efficient representation of conforming unstructured homogeneous meshes for scientific computations on CPU and GPU-based systems is presented. The data structure is provided as open-source software as part of the TNL library (https://tnl-project.org/). The abstract representation supports almost any cell shape and common 2D quadrilateral, 3D hexahedron and arbitrarily dimensional simplex shapes are currently built into the library. The implementation is highly configurable via templates of the C++ language, which allows avoiding the storage of unnecessary dynamic data. The internal memory layout is based on state-of-the-art sparse matrix storage formats, which are optimized for different hardware architectures in order to provide high-performance computations. The proposed data structure is also suitable for meshes decomposed into several subdomains and distributed computing using the Message Passing Interface (MPI). The efficiency of the implemented data structure on CPU and GPU hardware architectures is demonstrated on several benchmark problems and a comparison with another library. Its applicability to advanced numerical methods is demonstrated with an example problem of two-phase flow in porous media using a numerical scheme based on the mixed-hybrid finite element method (MHFEM). We show GPU speed-ups that rise above 20 in 2D and 50 in 3D when compared to sequential CPU computations, and above 2 in 2D and 9 in 3D when compared to 12-threaded CPU computations.
Pages: 1-30
Citations: 1
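The abstract above says the mesh's internal memory layout is based on sparse matrix storage formats. As a rough illustration of that idea only, the sketch below stores cell-to-vertex connectivity of a tiny triangular mesh in CSR (compressed sparse row) form and derives the vertex-to-cell map by transposing the pattern. All function names are hypothetical and are not taken from TNL, whose actual implementation is in C++ templates and may differ substantially.

```python
# Hypothetical sketch: mesh connectivity stored as a CSR sparsity pattern.
# None of these names come from TNL; they are for illustration only.

def build_csr(cells):
    """Flatten per-cell vertex lists into CSR offset and index arrays."""
    offsets = [0]
    indices = []
    for verts in cells:
        indices.extend(verts)
        offsets.append(len(indices))
    return offsets, indices


def cell_vertices(offsets, indices, cell):
    """Return the vertex indices of one cell as a slice of the index array."""
    return indices[offsets[cell]:offsets[cell + 1]]


def transpose_csr(offsets, indices, n_cols):
    """Build the vertex-to-cell map, i.e. the transposed incidence pattern."""
    counts = [0] * n_cols
    for v in indices:
        counts[v] += 1
    t_offsets = [0]
    for c in counts:
        t_offsets.append(t_offsets[-1] + c)
    fill = t_offsets[:-1].copy()  # next free slot in each transposed row
    t_indices = [0] * len(indices)
    for cell in range(len(offsets) - 1):
        for v in indices[offsets[cell]:offsets[cell + 1]]:
            t_indices[fill[v]] = cell
            fill[v] += 1
    return t_offsets, t_indices


# Unit square split into two triangles that share the diagonal 0-2.
cells = [[0, 1, 2], [0, 2, 3]]
offsets, indices = build_csr(cells)
v_offsets, v_cells = transpose_csr(offsets, indices, n_cols=4)
```

Because every cell in a homogeneous mesh has the same number of vertices, the offsets array is redundant in this example; a fixed-width (ELLPACK-like) layout could drop it, which is one way such a format can avoid storing unnecessary data.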
Algorithm 1028: VTMOP: Solver for Blackbox Multiobjective Optimization Problems
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2022-05-27 DOI: 10.1145/3529258
Tyler H. Chang, L.T. Watson, Jeffrey Larson, N. Neveu, W. Thacker, Shubhangi G. Deshpande, T. Lux
Abstract: VTMOP is a Fortran 2008 software package containing two Fortran modules for solving computationally expensive bound-constrained blackbox multiobjective optimization problems. VTMOP implements the algorithm of [32], which handles two or more objectives, does not require any derivatives, and produces well-distributed points over the Pareto front. The first module contains a general framework for solving multiobjective optimization problems by combining response surface methodology, trust region methodology, and an adaptive weighting scheme. The second module features a driver subroutine that implements this framework when the objective functions can be wrapped as a Fortran subroutine. Support is provided for both serial and parallel execution paradigms, and VTMOP is demonstrated on several test problems as well as one real-world problem in the area of particle accelerator optimization.
Pages: 1-34
Citations: 6
On Memory Traffic and Optimisations for Low-order Finite Element Assembly Algorithms on Multi-core CPUs
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2022-03-04 DOI: 10.1145/3503925
James D. Trotter, Xing Cai, S. Funke
Abstract: Motivated by the wish to understand the achievable performance of finite element assembly on unstructured computational meshes, we dissect the standard cellwise assembly algorithm into four kernels, two of which are dominated by irregular memory traffic. Several optimisation schemes are studied together with associated lower and upper bounds on the estimated memory traffic volume. Apart from properly reordering the mesh entities, the two most significant optimisations include adopting a lookup table in adding element matrices or vectors to their global counterparts, and using a row-wise assembly algorithm for multi-threaded parallelisation. Rigorous benchmarking shows that, due to the various optimisations, the actual volumes of memory traffic are in many cases very close to the estimated lower bounds. These results confirm the effectiveness of the optimisations, while also providing a recipe for developing efficient software for finite element assembly.
Pages: 1-31
Citations: 2
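One optimisation named in the abstract above is a lookup table used when adding element matrices to the global matrix. The sketch below is one plausible reading of that idea, not the authors' code: for each element, the position of every local entry in the CSR value array is computed once, so the assembly loop itself needs no column search. The 1D mesh, the element matrix, and all function names are assumptions made for illustration.

```python
# Hypothetical sketch of lookup-table assembly into a CSR matrix.
# Mesh, element matrix, and names are illustrative assumptions.

def build_csr_pattern(n_nodes, elements):
    """Sparsity pattern of the global matrix: sorted column indices per row."""
    rows = [set() for _ in range(n_nodes)]
    for nodes in elements:
        for i in nodes:
            rows[i].update(nodes)
    row_ptr, col_idx = [0], []
    for cols in rows:
        col_idx.extend(sorted(cols))
        row_ptr.append(len(col_idx))
    return row_ptr, col_idx


def build_lookup(row_ptr, col_idx, elements):
    """For each element, the CSR value-array position of every local entry."""
    lookup = []
    for nodes in elements:
        table = []
        for i in nodes:
            start, end = row_ptr[i], row_ptr[i + 1]
            row = col_idx[start:end]
            table.append([start + row.index(j) for j in nodes])
        lookup.append(table)
    return lookup


def assemble(n_values, lookup, element_matrices):
    """Scatter-add element matrices using the precomputed positions."""
    values = [0.0] * n_values
    for table, ke in zip(lookup, element_matrices):
        for i, positions in enumerate(table):
            for j, pos in enumerate(positions):
                values[pos] += ke[i][j]
    return values


# Three linear elements on a 1D mesh with four nodes and unit element length.
elements = [[0, 1], [1, 2], [2, 3]]
ke = [[1.0, -1.0], [-1.0, 1.0]]
row_ptr, col_idx = build_csr_pattern(4, elements)
lookup = build_lookup(row_ptr, col_idx, elements)
values = assemble(len(col_idx), lookup, [ke] * len(elements))
```

The table costs extra memory (one integer per local matrix entry), so whether it pays off depends on how often the same mesh is reassembled; the article's memory-traffic bounds are the place to look for that trade-off.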
Algorithm 1021: SPEX Left LU, Exactly Solving Sparse Linear Systems via a Sparse Left-looking Integer-preserving LU Factorization
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2022-03-04 DOI: 10.1145/3519024
Christopher Lourenco, Jinhao Chen, Erick Moreno-Centeno, T. Davis
Abstract: SPEX Left LU is a software package for exactly solving unsymmetric sparse linear systems. As a component of the sparse exact (SPEX) software package, SPEX Left LU can be applied to any input matrix, A, whose entries are integral, rational, or decimal, and provides a solution to the system Ax = b, which is either exact or accurate to user-specified precision. SPEX Left LU preorders the matrix A with a user-specified fill-reducing ordering and computes a left-looking LU factorization with the special property that each operation used to compute the L and U matrices is integral. Notable additional applications of this package include benchmarking the stability and accuracy of state-of-the-art linear solvers and determining whether singular-to-double-precision matrices are indeed singular. Computationally, this article evaluates the impact of several novel pivoting schemes in exact arithmetic, benchmarks the exact iterative solvers within Linbox, and benchmarks the accuracy of MATLAB sparse backslash. Most importantly, it is shown that SPEX Left LU outperforms the exact iterative solvers in run time on easy instances and in stability as the iterative solver fails on a sizeable subset of the tested (both easy and hard) instances. The SPEX Left LU package is written in ANSI C, comes with a MATLAB interface, and is distributed via GitHub, as a component of the SPEX software package, and as a component of SuiteSparse.
Pages: 1-23
Citations: 0
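The abstract above highlights that every operation in the factorization is integral. To show what integer-preserving elimination can look like in the simplest setting, here is a dense Bareiss-style fraction-free elimination that computes a determinant using only exact integer arithmetic. This is a textbook technique chosen for illustration; it is not the sparse left-looking algorithm of SPEX Left LU, and the function name is hypothetical.

```python
# Dense fraction-free (Bareiss-style) elimination on an integer matrix.
# Illustration only: not the sparse left-looking algorithm from SPEX.

def bareiss_determinant(a):
    """Return det(a) for a square integer matrix using exact integer steps."""
    m = [row[:] for row in a]  # work on a copy
    n = len(m)
    sign = 1
    prev = 1  # previous pivot; every update below is exactly divisible by it
    for k in range(n - 1):
        if m[k][k] == 0:
            # Swap in a row with a nonzero entry in column k, if one exists.
            for r in range(k + 1, n):
                if m[r][k] != 0:
                    m[k], m[r] = m[r], m[k]
                    sign = -sign
                    break
            else:
                return 0  # whole column is zero, so the matrix is singular
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                # The division is exact, so // introduces no rounding.
                m[i][j] = (m[i][j] * m[k][k] - m[i][k] * m[k][j]) // prev
            m[i][k] = 0
        prev = m[k][k]
    return sign * m[n - 1][n - 1]


det_a = bareiss_determinant([[2, 3, 1], [4, 1, 5], [6, 2, 3]])
det_b = bareiss_determinant([[0, 1], [1, 0]])
```

Dividing by the previous pivot keeps intermediate entries from growing as fast as they would under naive cross-multiplication, which is the usual motivation for this family of methods.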
A Provably Robust Algorithm for Triangle-triangle Intersections in Floating-point Arithmetic
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2022-03-04 DOI: 10.1145/3513264
Conor Mccoid, M. Gander
Abstract: Motivated by the unexpected failure of the triangle intersection component of the Projection Algorithm for Nonmatching Grids (PANG), this article provides a robust version with proof of backward stability. The new triangle intersection algorithm ensures consistency and parsimony across three types of calculations. The set of intersections produced by the algorithm, called representations, is shown to match the set of geometric intersections, called models. The article concludes with a comparison between the old and new intersection algorithms for PANG using an example found to reliably generate failures in the former.
Pages: 1-30
Citations: 7
Exploiting Problem Structure in Derivative Free Optimization
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2022-02-16 DOI: 10.1145/3474054
M. Porcelli, P. Toint
Abstract: A structured version of derivative-free random pattern search optimization algorithms is introduced, which is able to exploit coordinate partially separable structure (typically associated with sparsity) often present in unconstrained and bound-constrained optimization problems. This technique improves performance by orders of magnitude and makes it possible to solve large problems that otherwise are totally intractable by other derivative-free methods. A library of interpolation-based modelling tools is also described, which can be associated with the structured or unstructured versions of the initial pattern search algorithm. The use of the library further enhances performance, especially when associated with structure. The significant gains in performance associated with these two techniques are illustrated using a new freely-available release of the Brute Force Optimizer (BFO) package first introduced in [Porcelli and Toint 2017], which incorporates them. An interesting conclusion of the numerical results presented is that providing global structural information on a problem can result in significantly fewer evaluations of the objective function than attempting to build local Taylor-like models.
Pages: 1-25
Citations: 7
Reproduced Computational Results Report for “Ginkgo: A Modern Linear Operator Algebra Framework for High Performance Computing”
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2022-02-16 DOI: 10.1145/3480936
C. Balos
Abstract: The article titled “Ginkgo: A Modern Linear Operator Algebra Framework for High Performance Computing” by Anzt et al. presents a modern, linear operator centric, C++ library for sparse linear algebra. Experimental results in the article demonstrate that Ginkgo is a flexible and user-friendly framework capable of achieving high performance on state-of-the-art GPU architectures. In this report, the Ginkgo library is installed and a subset of the experimental results is reproduced. Specifically, the experiment that shows the achieved memory bandwidth of the Ginkgo Krylov linear solvers on NVIDIA A100 and AMD MI100 GPUs is redone and the results are compared to those presented in the published article. Upon completion of the comparison, the published results are deemed reproducible.
Pages: 1-7
Citations: 0
A Computational Study of Using Black-box QR Solvers for Large-scale Sparse-dense Linear Least Squares Problems
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2022-02-16 DOI: 10.1145/3494527
J. Scott, M. Tuma
Abstract: Large-scale overdetermined linear least squares problems arise in many practical applications. One popular solution method is based on the backward stable QR factorization of the system matrix A. This article focuses on sparse-dense least squares problems in which A is sparse apart from a small number of rows that are considered dense. For large-scale problems, the direct application of a QR solver either fails because of insufficient memory or is unacceptably slow. We study several solution approaches based on using a sparse QR solver without modification, focussing on the case that the sparse part of A is rank deficient. We discuss partial matrix stretching and regularization and propose extending the augmented system formulation with iterative refinement for sparse problems to sparse-dense problems, optionally incorporating multi-precision arithmetic. In summary, our computational study shows that, before applying a black-box QR factorization, a check should be made for rows that are classified as dense and, if such rows are identified, then A should be split into sparse and dense blocks; a number of ways to use a black-box QR factorization to exploit this splitting are possible, with no single method found to be the best in all cases.
Pages: 1-24
Citations: 7
Algorithm 1018: FaVeST—Fast Vector Spherical Harmonic Transforms
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2021-09-28 DOI: 10.1145/3458470
Q. L. Le Gia, Ming Li, Yu Guang Wang
Abstract: Vector spherical harmonics on the unit sphere of ℝ³ have broad applications in geophysics, quantum mechanics, and astrophysics. In the representation of a tangent vector field, one needs to evaluate the expansion and the Fourier coefficients of vector spherical harmonics. In this article, we develop fast algorithms (FaVeST) for vector spherical harmonic transforms on these evaluations. The forward FaVeST evaluates the Fourier coefficients and has a computational cost proportional to N log √N for N evaluation points. The adjoint FaVeST, which evaluates a linear combination of vector spherical harmonics with a degree up to √M for M evaluation points, has cost proportional to M log √M. Numerical examples of simulated tangent fields illustrate the accuracy, efficiency, and stability of FaVeST.
Pages: 1-24
Citations: 1
Corrigendum: Remark on Algorithm 723: Fresnel Integrals
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2021-09-28 DOI: 10.1145/3452336
W. Van Snyder
Abstract: There are mistakes and typographical errors in Remark on Algorithm 723: Fresnel Integrals, which appeared in ACM Transactions on Mathematical Software 22, 4 (December 1996). This remark corrects those errors. The software provided to Collected Algorithms of the ACM was correct.
Pages: 1-1
Citations: 0