ACM Transactions on Mathematical Software (TOMS): Latest Articles

Algorithm 995
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2019-07-18 DOI: 10.1145/3301321
Juliette Pardue, Andrey N. Chernikov
{"title":"Algorithm 995","authors":"Juliette Pardue, Andrey N. Chernikov","doi":"10.1145/3301321","DOIUrl":"https://doi.org/10.1145/3301321","url":null,"abstract":"A bottom-up approach to parallel anisotropic mesh generation is presented by building a mesh generator starting from the basic operations of vertex insertion and Delaunay triangles. Applications focusing on high-lift design or dynamic stall, or numerical methods and modeling test cases, still focus on two-dimensional domains. This automated parallel mesh generation approach can generate high-fidelity unstructured meshes with anisotropic boundary layers for use in the computational fluid dynamics field. The anisotropy requirement adds a level of complexity to a parallel meshing algorithm by making computation depend on the local alignment of elements, which in turn is dictated by geometric boundaries and the density functions, one-dimensional spacing functions generated from an exponential distribution. This approach yields computational savings in mesh generation and flow solution through well-shaped anisotropic triangles instead of isotropic triangles. The validity of the meshes is shown through solution characteristic comparisons to verified reference solutions. A 79% parallel weak scaling efficiency on 1,024 distributed memory nodes, and a 72% parallel efficiency over the fastest sequential isotropic mesh generator on 512 distributed memory nodes, is shown through numerical experiments.","PeriodicalId":7036,"journal":{"name":"ACM Transactions on Mathematical Software (TOMS)","volume":"33 1","pages":"1 - 30"},"PeriodicalIF":0.0,"publicationDate":"2019-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84554039","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Algorithm 994
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2019-06-05 DOI: 10.1145/3302389
F. Hernando, Francisco D. Igual, G. Quintana-Ortí
{"title":"Algorithm 994","authors":"F. Hernando, Francisco D. Igual, G. Quintana-Ortí","doi":"10.1145/3302389","DOIUrl":"https://doi.org/10.1145/3302389","url":null,"abstract":"The minimum distance of an error-correcting code is an important concept in information theory. Hence, computing the minimum distance of a code with a minimum computational cost is crucial to many problems in this area. In this article, we present and assess a family of implementations of both the brute-force algorithm and the Brouwer-Zimmermann algorithm for computing the minimum distance of a random linear code over F2 that are faster than current implementations, both in the commercial and public domain. In addition to the basic sequential implementations, we present parallel and vectorized implementations that produce high performances on modern architectures. The attained performance results show the benefits of the developed optimized algorithms, which obtain remarkable improvements compared with state-of-the-art implementations widely used nowadays.","PeriodicalId":7036,"journal":{"name":"ACM Transactions on Mathematical Software (TOMS)","volume":"21 1","pages":"1 - 28"},"PeriodicalIF":0.0,"publicationDate":"2019-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77977526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
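As a rough illustration of the brute-force baseline mentioned in the Algorithm 994 abstract above, the following Python sketch enumerates every nonzero codeword of a small binary linear code and returns the smallest Hamming weight. The function name `min_distance_brute_force`, the bitmask encoding of generator rows, and the Hamming(7,4) example matrix are assumptions made for illustration. This is not the authors' optimized implementation, and it does not attempt the Brouwer-Zimmermann algorithm.

```python
def min_distance_brute_force(rows):
    """Minimum distance of a binary linear code by exhaustive enumeration.

    `rows` holds the generator-matrix rows as integers (one bit per column).
    The rows are assumed to be linearly independent over F2; otherwise some
    nonzero message maps to the zero word and the result would be 0.
    Cost is O(2^k * k), so this is only practical for small dimension k.
    """
    k = len(rows)
    best = None
    for msg in range(1, 1 << k):          # every nonzero message vector
        word = 0
        for i in range(k):
            if (msg >> i) & 1:
                word ^= rows[i]           # addition over F2 is XOR
        weight = bin(word).count("1")     # Hamming weight of the codeword
        if best is None or weight < best:
            best = weight
    return best


# Hypothetical example: systematic generator matrix [I | P] of a Hamming(7,4) code.
HAMMING_7_4 = [0b1000110, 0b0100101, 0b0010011, 0b0001111]
d = min_distance_brute_force(HAMMING_7_4)
```

For a linear code the minimum distance equals the minimum weight over nonzero codewords, which is why a single pass over the 2^k - 1 nonzero messages suffices here.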
CGPOPS
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2019-05-28 DOI: 10.1145/3390463
Yunus M. Agamawi, Anil V. Rao
{"title":"CGPOPS","authors":"Yunus M. Agamawi, Anil V. Rao","doi":"10.1145/3390463","DOIUrl":"https://doi.org/10.1145/3390463","url":null,"abstract":"A general-purpose C++ software program called CGPOPS is described for solving multiple-phase optimal control problems using adaptive direct orthogonal collocation methods. The software employs a Legendre-Gauss-Radau direct orthogonal collocation method to transcribe the continuous optimal control problem into a large sparse nonlinear programming problem (NLP). A class of hp mesh refinement methods are implemented that determine the number of mesh intervals and the degree of the approximating polynomial within each mesh interval to achieve a specified accuracy tolerance. The software is interfaced with the open source Newton NLP solver IPOPT. All derivatives required by the NLP solver are computed via central finite differencing, bicomplex-step derivative approximations, hyper-dual derivative approximations, or automatic differentiation. The key components of the software are described in detail, and the utility of the software is demonstrated on five optimal control problems of varying complexity. The software described in this article provides researchers a transitional platform to solve a wide variety of complex constrained optimal control problems.","PeriodicalId":7036,"journal":{"name":"ACM Transactions on Mathematical Software (TOMS)","volume":"57 1","pages":"1 - 38"},"PeriodicalIF":0.0,"publicationDate":"2019-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85956700","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 18
An Algorithm for the Complete Solution of the Quartic Eigenvalue Problem
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2019-05-16 DOI: 10.1145/3494528
Z. Drmač, Ivana Šain Glibić
{"title":"An Algorithm for the Complete Solution of the Quartic Eigenvalue Problem","authors":"Z. Drmač, Ivana Šain Glibić","doi":"10.1145/3494528","DOIUrl":"https://doi.org/10.1145/3494528","url":null,"abstract":"The quartic eigenvalue problem (λ⁴A + λ³B + λ²C + λD + E)x = 0 naturally arises in a plethora of applications, such as when solving the Orr–Sommerfeld equation in the stability analysis of the Poiseuille flow, in theoretical analysis and experimental design of locally resonant phononic plates, modeling a robot with electric motors in the joints, calibration of catadioptric vision system, or, for example, computation of the guided and leaky modes of a planar waveguide. This article proposes a new numerical method for the full solution (all eigenvalues and all left and right eigenvectors) that, starting with a suitable linearization, uses an initial, structure-preserving reduction designed to reveal and deflate a certain number of zero and infinite eigenvalues before the final linearization is forwarded to the QZ algorithm. The backward error in the reduction phase is bounded column wise in each coefficient matrix, which is advantageous if the coefficient matrices are graded. Numerical examples show that the proposed algorithm is capable of computing the eigenpairs with small residuals, and that it is competitive with the available state-of-the-art methods.","PeriodicalId":7036,"journal":{"name":"ACM Transactions on Mathematical Software (TOMS)","volume":"27 1","pages":"1 - 34"},"PeriodicalIF":0.0,"publicationDate":"2019-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82866952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
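The abstract above does not say which linearization the authors start from. One standard choice, shown here purely as an assumed example and not as the paper's method, is the first companion form. It rewrites the quartic problem with n-by-n coefficients A, B, C, D, E as a generalized eigenvalue problem of size 4n that a QZ-type solver can handle:

```latex
\left(
\lambda
\begin{bmatrix}
A & 0 & 0 & 0 \\
0 & I & 0 & 0 \\
0 & 0 & I & 0 \\
0 & 0 & 0 & I
\end{bmatrix}
+
\begin{bmatrix}
B & C & D & E \\
-I & 0 & 0 & 0 \\
0 & -I & 0 & 0 \\
0 & 0 & -I & 0
\end{bmatrix}
\right)
\begin{bmatrix}
\lambda^{3} x \\ \lambda^{2} x \\ \lambda x \\ x
\end{bmatrix}
= 0
```

The first block row reproduces (λ⁴A + λ³B + λ²C + λD + E)x = 0, and the remaining three block rows are identities that tie the stacked blocks together. A singular A gives infinite eigenvalues of this pencil and a singular E gives zero eigenvalues, which is the kind of structure the abstract says the reduction step is designed to reveal and deflate.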
Algorithm 993
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2019-05-03 DOI: 10.1145/3291041
P. Fackler
{"title":"Algorithm 993","authors":"P. Fackler","doi":"10.1145/3291041","DOIUrl":"https://doi.org/10.1145/3291041","url":null,"abstract":"An algorithm for multiplying a chain of Kronecker products by a matrix is described. The algorithm does not require that the Kronecker chain actually be computed and the main computational work is a series of matrix-matrix multiplications. Use of the algorithm can lead to substantial savings in both memory requirements and computational speed. Although similar algorithms have been described before, this article makes two novel contributions. First, it shows how shuffling of data can be (largely) avoided. Second, it provides a simple method to determine the optimal ordering of the workflow.","PeriodicalId":7036,"journal":{"name":"ACM Transactions on Mathematical Software (TOMS)","volume":"25 1","pages":"1 - 9"},"PeriodicalIF":0.0,"publicationDate":"2019-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77982615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
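The idea in the Algorithm 993 abstract above, multiplying by a chain of Kronecker products without ever forming the chain, can be sketched in a few lines of plain Python. The function name `kron_chain_matvec`, the list-of-lists matrix format, and the row-major layout of the vector are assumptions for illustration. This sketch applies the factors left to right and does not reproduce the article's two contributions (avoiding data shuffling and choosing the optimal ordering).

```python
def kron_chain_matvec(mats, x):
    """Compute (A1 kron A2 kron ... kron Ak) @ x without forming the product.

    `mats` is a list of dense matrices (lists of rows); `x` is a flat list
    whose length is the product of the factors' column counts, indexed in
    row-major order over the factor dimensions. Each factor is applied along
    its own axis, so only small matrix products are ever performed.
    """
    right = 1
    for a in mats:
        right *= len(a[0])
    if len(x) != right:
        raise ValueError("length of x must equal the product of column counts")

    left = 1                                  # product of already-applied row counts
    for a in mats:
        m, n = len(a), len(a[0])
        right //= n                           # product of not-yet-applied column counts
        y = [0] * (left * m * right)
        for l in range(left):
            for i in range(m):
                dst = (l * m + i) * right
                for c in range(n):
                    coef = a[i][c]
                    if coef == 0:
                        continue
                    src = (l * n + c) * right
                    for r in range(right):
                        y[dst + r] += coef * x[src + r]
        x = y
        left *= m
    return x


# Hypothetical 2x2 factors; the explicit 4x4 Kronecker product is never built.
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
result = kron_chain_matvec([A, B], [1, 2, 3, 4])
```

For square factors of size n and a chain of length k, this performs on the order of k * n^(k+1) multiplications instead of the n^(2k) needed once the full product matrix is formed, which is the memory and time saving the abstract refers to.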
Batched Triangular Dense Linear Algebra Kernels for Very Small Matrix Sizes on GPUs
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2019-05-03 DOI: 10.1145/3267101
A. Charara, D. Keyes, H. Ltaief
{"title":"Batched Triangular Dense Linear Algebra Kernels for Very Small Matrix Sizes on GPUs","authors":"A. Charara, D. Keyes, H. Ltaief","doi":"10.1145/3267101","DOIUrl":"https://doi.org/10.1145/3267101","url":null,"abstract":"Batched dense linear algebra kernels are becoming ubiquitous in scientific applications, ranging from tensor contractions in deep learning to data compression in hierarchical low-rank matrix approximation. Within a single API call, these kernels are capable of simultaneously launching up to thousands of similar matrix computations, removing the expensive overhead of multiple API calls while increasing the occupancy of the underlying hardware. A challenge is that for the existing hardware landscape (x86, GPUs, etc.), only a subset of the required batched operations is implemented by the vendors, with limited support for very small problem sizes. We describe the design and performance of a new class of batched triangular dense linear algebra kernels on very small data sizes (up to 256) using single and multiple GPUs. By deploying recursive formulations, stressing the register usage, maintaining data locality, reducing threads synchronization, and fusing successive kernel calls, the new batched kernels outperform existing state-of-the-art implementations.","PeriodicalId":7036,"journal":{"name":"ACM Transactions on Mathematical Software (TOMS)","volume":"37 1","pages":"1 - 28"},"PeriodicalIF":0.0,"publicationDate":"2019-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85497361","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 16
Client-side Computational Optimization
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2019-04-29 DOI: 10.1145/3309549
V. Maniezzo, Marco A. Boschetti, A. Carbonaro, M. Marzolla, F. Strappaveccia
{"title":"Client-side Computational Optimization","authors":"V. Maniezzo, Marco A. Boschetti, A. Carbonaro, M. Marzolla, F. Strappaveccia","doi":"10.1145/3309549","DOIUrl":"https://doi.org/10.1145/3309549","url":null,"abstract":"Mobile platforms have matured to a point where they can provide the infrastructure required to support sophisticated optimization codes. This opens the possibility to envisage new interest for distributed application codes and the opportunity to intensify research on optimization algorithms requiring limited computational resources, as provided by mobile platforms. In this article, we report on some exploratory experience in this area. We illustrate some practical, real-world cases where running optimization programs on mobile or embedded devices can be useful, with particular emphasis on matheuristics approaches. Then, we discuss a practical use case involving the feasibility version of the generalized assignment problem (GAP). We present a JavaScript implementation of a GAP solver that can be executed inside an ordinary browser supporting ECMAScript. We tested the code on different smartphones of varying age and power, as well as on desktop PCs and other embedded devices. Our experiments confirm the viability of mobile devices for computational intensive tasks.","PeriodicalId":7036,"journal":{"name":"ACM Transactions on Mathematical Software (TOMS)","volume":"42 1","pages":"1 - 16"},"PeriodicalIF":0.0,"publicationDate":"2019-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85481682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
ChASE
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2019-04-26 DOI: 10.1145/3313828
Jan Winkelmann, P. Springer, E. D. Napoli
{"title":"ChASE","authors":"Jan Winkelmann, P. Springer, E. D. Napoli","doi":"10.1145/3313828","DOIUrl":"https://doi.org/10.1145/3313828","url":null,"abstract":"Solving dense Hermitian eigenproblems arranged in a sequence with direct solvers fails to take advantage of those spectral properties that are pertinent to the entire sequence and not just to the single problem. When such features take the form of correlations between the eigenvectors of consecutive problems, as is the case in many real-world applications, the potential benefit of exploiting them can be substantial. We present the Chebyshev Accelerated Subspace iteration Eigensolver (ChASE), a modern algorithm and library based on subspace iteration with polynomial acceleration. Novel to ChASE is the computation of the spectral estimates that enter in the filter and an optimization of the polynomial degree that further reduces the necessary floating-point operations. ChASE is written in C++ using the modern software engineering concepts that favor a simple integration in application codes and a straightforward portability over heterogeneous platforms. When solving sequences of Hermitian eigenproblems for a portion of their extremal spectrum, ChASE greatly benefits from the sequence’s spectral properties and outperforms direct solvers in many scenarios. The library ships with two distinct parallelization schemes, supports execution over distributed GPUs, and is easily extensible to other parallel computing architectures.","PeriodicalId":7036,"journal":{"name":"ACM Transactions on Mathematical Software (TOMS)","volume":"48 1","pages":"1 - 34"},"PeriodicalIF":0.0,"publicationDate":"2019-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72875971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 22
A QDWH-based SVD Software Framework on Distributed-memory Manycore Systems
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2019-04-26 DOI: 10.1145/3309548
D. Sukkari, H. Ltaief, Aniello Esposito, D. Keyes
{"title":"A QDWH-based SVD Software Framework on Distributed-memory Manycore Systems","authors":"D. Sukkari, H. Ltaief, Aniello Esposito, D. Keyes","doi":"10.1145/3309548","DOIUrl":"https://doi.org/10.1145/3309548","url":null,"abstract":"This article presents a high-performance software framework for computing a dense SVD on distributed-memory manycore systems. Originally introduced by Nakatsukasa et al. (2010) and Nakatsukasa and Higham (2013), the SVD solver relies on the polar decomposition using the QR Dynamically Weighted Halley algorithm (QDWH). Although the QDWH-based SVD algorithm performs a significant amount of extra floating-point operations compared to the traditional SVD with the one-stage bidiagonal reduction, the inherent high level of concurrency associated with Level 3 BLAS compute-bound kernels ultimately compensates for the arithmetic complexity overhead. Using the ScaLAPACK two-dimensional block cyclic data distribution with a rectangular processor topology, the resulting QDWH-SVD further reduces excessive communications during the panel factorization, while increasing the degree of parallelism during the update of the trailing submatrix, as opposed to relying on the default square processor grid. After detailing the algorithmic complexity and the memory footprint of the algorithm, we conduct a thorough performance analysis and study the impact of the grid topology on the performance by looking at the communication and computation profiling trade-offs. We report performance results against state-of-the-art existing QDWH software implementations (e.g., Elemental) and their SVD extensions on large-scale distributed-memory manycore systems based on commodity Intel x86 Haswell processors and Knights Landing (KNL) architecture. The QDWH-SVD framework achieves up to 3/8-fold speedups on the Haswell/KNL-based platforms, respectively, against ScaLAPACK PDGESVD and turns out to be a competitive alternative for well- and ill-conditioned matrices. We finally come up herein with a performance model based on these empirical results. Our QDWH-based polar decomposition and its SVD extension are freely available at https://github.com/ecrc/qdwh.git and https://github.com/ecrc/ksvd.git, respectively, and have been integrated into the Cray Scientific numerical library LibSci v17.11.1.","PeriodicalId":7036,"journal":{"name":"ACM Transactions on Mathematical Software (TOMS)","volume":"63 1","pages":"1 - 21"},"PeriodicalIF":0.0,"publicationDate":"2019-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77443572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
JGraphT—A Java Library for Graph Data Structures and Algorithms
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2019-04-17 DOI: 10.1145/3381449
D. Michail, Joris Kinable, Barak Naveh, John V. Sichi
{"title":"JGraphT—A Java Library for Graph Data Structures and Algorithms","authors":"D. Michail, Joris Kinable, Barak Naveh, John V. Sichi","doi":"10.1145/3381449","DOIUrl":"https://doi.org/10.1145/3381449","url":null,"abstract":"Mathematical software and graph-theoretical algorithmic packages to efficiently model, analyze, and query graphs are crucial in an era where large-scale spatial, societal, and economic network data are abundantly available. One such package is JGraphT, a programming library that contains very efficient and generic graph data structures along with a large collection of state-of-the-art algorithms. The library is written in Java with stability, interoperability, and performance in mind. A distinctive feature of this library is its ability to model vertices and edges as arbitrary objects, thereby permitting natural representations of many common networks, including transportation, social, and biological networks. Besides classic graph algorithms such as shortest-paths and spanning-tree algorithms, the library contains numerous advanced algorithms: graph and subgraph isomorphism, matching and flow problems, approximation algorithms for NP-hard problems such as independent set and the traveling salesman problem, and several more exotic algorithms such as Berge graph detection. Due to its versatility and generic design, JGraphT is currently used in large-scale commercial products, as well as noncommercial and academic research projects. In this work, we describe in detail the design and underlying structure of the library, and discuss its most important features and algorithms. A computational study is conducted to evaluate the performance of JGraphT versus several similar libraries. Experiments on a large number of graphs over a variety of popular algorithms show that JGraphT is highly competitive with other established libraries such as NetworkX or the BGL.","PeriodicalId":7036,"journal":{"name":"ACM Transactions on Mathematical Software (TOMS)","volume":"4 1","pages":"1 - 29"},"PeriodicalIF":0.0,"publicationDate":"2019-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82454026","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 81