ACM Transactions on Mathematical Software (TOMS): Latest Articles, Page 5

Algorithm 1012

ACM Transactions on Mathematical Software (TOMS) Pub Date : 2020-11-07 DOI: 10.1145/3422818

Tyler H. Chang, L. Watson, T. Lux, A. Butt, K. Cameron, Yili Hong

Citations: 5

Replicated Computational Results (RCR) Report for “Adaptive Precision Block-Jacobi for High Performance Preconditioning in the Ginkgo Linear Algebra Software”

ACM Transactions on Mathematical Software (TOMS) Pub Date : 2020-10-27 DOI: 10.1145/3446000

S. Osborn

Citations: 0

Formalization of Double-Word Arithmetic, and Comments on “Tight and Rigorous Error Bounds for Basic Building Blocks of Double-Word Arithmetic”

ACM Transactions on Mathematical Software (TOMS) Pub Date : 2020-10-20 DOI: 10.1145/3484514

J. Muller, L. Rideau

Citations: 3

A Feature-complete SPIKE Dense Banded Solver

ACM Transactions on Mathematical Software (TOMS) Pub Date : 2020-10-16 DOI: 10.1145/3410153

Braegan S. Spring, E. Polizzi, A. Sameh

Citations: 1

Variable Step-Size Control Based on Two-Steps for Radau IIA Methods

ACM Transactions on Mathematical Software (TOMS) Pub Date : 2020-10-16 DOI: 10.1145/3408892

S. G. Pinto, D. H. Abreu, J. I. Montijano

Citations: 1

PHIST

ACM Transactions on Mathematical Software (TOMS) Pub Date : 2020-10-16 DOI: 10.1145/3402227

J. Thies, Melven Röhrig-Zöllner, N. Overmars, A. Basermann, Dominik Ernst, G. Hager, G. Wellein

Abstract: The increasing complexity of hardware and software environments in high-performance computing poses big challenges on the development of sustainable and hardware-efficient numerical software. This article addresses these challenges in the context of sparse solvers. Existing solutions typically target sustainability, flexibility, or performance, but rarely all of them. Our new library PHIST provides implementations of solvers for sparse linear systems and eigenvalue problems. It is a productivity platform for performance-aware developers of algorithms and application software with abstractions that do not obscure the view on hardware-software interaction. The PHIST software architecture and the PHIST development process were designed to overcome shortcomings of existing packages. An interface layer for basic sparse linear algebra functionality that can be provided by multiple backends ensures sustainability, and PHIST supports common techniques for improving scalability and performance of algorithms such as blocking and kernel fusion. We showcase these concepts using the PHIST implementation of a block Jacobi-Davidson solver for non-Hermitian and generalized eigenproblems. We study its performance on a multi-core CPU, a GPU, and a large-scale many-core system. Furthermore, we show how an existing implementation of a block Krylov-Schur method in the Trilinos package Anasazi can benefit from the performance engineering techniques used in PHIST.

Pages: 1-26

Citations: 1
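
The PHIST abstract names kernel fusion as one of the performance techniques the library supports. As a minimal sketch of the general idea only (plain Python, not PHIST's API; the function names `axpy_then_norm_sq` and `fused_axpy_norm_sq` are hypothetical), the following compares a vector update followed by a separate reduction with a single fused pass that touches each element once:

```python
def axpy_then_norm_sq(alpha, x, y):
    """Unfused: one pass for y_new = alpha*x + y, a second pass for ||y_new||^2."""
    y_new = []
    for xi, yi in zip(x, y):
        y_new.append(alpha * xi + yi)

    norm_sq = 0.0
    for v in y_new:  # second traversal of the freshly written vector
        norm_sq += v * v
    return y_new, norm_sq


def fused_axpy_norm_sq(alpha, x, y):
    """Fused: update and reduction share one loop, so each element is visited once."""
    y_new = []
    norm_sq = 0.0
    for xi, yi in zip(x, y):
        v = alpha * xi + yi
        y_new.append(v)
        norm_sq += v * v  # reuse the value while it is still at hand
    return y_new, norm_sq


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0]
    y = [0.5, -1.0, 4.0]
    print(axpy_then_norm_sq(2.0, x, y))
    print(fused_axpy_norm_sq(2.0, x, y))
```

Both functions accumulate in the same order, so they return identical results. In a memory-bound compiled kernel the fused variant would avoid re-reading the updated vector, which is the usual motivation for fusion; in pure Python the benefit is illustrative rather than measurable.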

Algorithm 1011

ACM Transactions on Mathematical Software (TOMS) Pub Date : 2020-09-15 DOI: 10.1145/3408891

Thomas Mejstrik

Citations: 13

Polynomial Evaluation on Superscalar Architecture, Applied to the Elementary Function e^x

ACM Transactions on Mathematical Software (TOMS) Pub Date : 2020-09-15 DOI: 10.1145/3408893

Timothée Ewart, Francesco Cremonesi, F. Schürmann, F. Delalondre

Citations: 5

BiqBin: A Parallel Branch-and-bound Solver for Binary Quadratic Problems with Linear Constraints

ACM Transactions on Mathematical Software (TOMS) Pub Date : 2020-09-14 DOI: 10.1145/3514039

Nicoló Gusmeroli, T. Hrga, Borut Lužar, J. Povh, Melanie Siebenhofer, Angelika Wiegele

Abstract: We present BiqBin, an exact solver for linearly constrained binary quadratic problems. Our approach is based on an exact penalty method to first efficiently transform the original problem into an instance of Max-Cut, and then to solve the Max-Cut problem by a branch-and-bound algorithm. All the main ingredients are carefully developed using new semidefinite programming relaxations obtained by strengthening the existing relaxations with a set of hypermetric inequalities, applying the bundle method as the bounding routine and using new strategies for exploring the branch-and-bound tree. Furthermore, an efficient C implementation of a sequential and a parallel branch-and-bound algorithm is presented. The latter is based on a load coordinator-worker scheme using MPI for multi-node parallelization and is evaluated on a high-performance computer. The new solver is benchmarked against BiqCrunch, GUROBI, and SCIP on four families of (linearly constrained) binary quadratic problems. Numerical results demonstrate that BiqBin is a highly competitive solver. The serial version outperforms the other three solvers on the majority of the benchmark instances. We also evaluate the parallel solver and show that it has good scaling properties. The general audience can use it as an on-line service available at http://www.biqbin.eu.

Pages: 1-31

Citations: 7
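
The BiqBin abstract describes using a penalty term to turn a linearly constrained binary quadratic problem into an unconstrained one. The sketch below illustrates only that first idea on a tiny made-up instance, solved by brute force; it is not BiqBin's algorithm (no Max-Cut transformation, semidefinite relaxation, or branch-and-bound). The matrix `Q`, the constraint `A x = b`, the choice of `sigma`, and all function names are assumptions made for illustration. With integer data, any infeasible point violates the constraints by at least 1, so a penalty weight larger than the full range of the objective is enough to make the penalized minimizer feasible.

```python
from itertools import product


def quad(Q, x):
    """Objective x^T Q x for a 0/1 vector x."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def violation(A, b, x):
    """Squared residual ||A x - b||^2 of the linear constraints."""
    return sum(
        (sum(a * xi for a, xi in zip(row, x)) - bi) ** 2
        for row, bi in zip(A, b)
    )


def solve_constrained(Q, A, b):
    """Brute force: minimize x^T Q x over binary x with A x = b."""
    feasible = [
        x for x in product((0, 1), repeat=len(Q)) if violation(A, b, x) == 0
    ]
    return min(feasible, key=lambda x: quad(Q, x))


def solve_penalized(Q, A, b, sigma):
    """Brute force: minimize x^T Q x + sigma * ||A x - b||^2 over all binary x."""
    return min(
        product((0, 1), repeat=len(Q)),
        key=lambda x: quad(Q, x) + sigma * violation(A, b, x),
    )


# Hypothetical instance: choose exactly one of four items.
Q = [
    [-1, 2, 0, 1],
    [2, -3, 1, 0],
    [0, 1, -2, 2],
    [1, 0, 2, -1],
]
A = [[1, 1, 1, 1]]
b = [1]

# The objective lies in [-S, S] with S = sum |Q_ij|, so sigma = 2S + 1 suffices.
SIGMA = 2 * sum(abs(q) for row in Q for q in row) + 1

if __name__ == "__main__":
    print("constrained:", solve_constrained(Q, A, b))
    print("penalized:  ", solve_penalized(Q, A, b, SIGMA))
    print("no penalty: ", solve_penalized(Q, A, b, 0))
```

On this instance the unpenalized minimizer selects two items and violates the constraint, while the penalized problem recovers the constrained optimum. Brute force is used only to keep the check transparent; it does not scale beyond toy sizes.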

Algorithms for Efficient Reproducible Floating Point Summation

ACM Transactions on Mathematical Software (TOMS) Pub Date : 2020-07-21 DOI: 10.1145/3389360

Peter Ahrens, J. Demmel, Hong Diep Nguyen

Abstract: We define “reproducibility” as getting bitwise identical results from multiple runs of the same program, perhaps with different hardware resources or other changes that should not affect the answer. Many users depend on reproducibility for debugging or correctness. However, dynamic scheduling of parallel computing resources, combined with nonassociative floating point addition, makes reproducibility challenging even for summation, or operations like the BLAS. We describe a “reproducible accumulator” data structure (the “binned number”) and associated algorithms to reproducibly sum binary floating point numbers, independent of summation order. We use a subset of the IEEE Floating Point Standard 754-2008 and bitwise operations on the standard representations in memory. Our approach requires only one read-only pass over the data, and one reduction in parallel, using a 6-word reproducible accumulator (more words can be used for higher accuracy), enabling standard tiling optimization techniques. Summing n words with a 6-word reproducible accumulator requires approximately 9n floating point operations (arithmetic, comparison, and absolute value) and approximately 3n bitwise operations. The final error bound with a 6-word reproducible accumulator and our default settings can be up to 2^29 times smaller than the error bound for conventional (recursive) summation on ill-conditioned double-precision inputs.

Pages: 1-49

Citations: 9
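
The reproducibility problem in the summation abstract stems from floating point addition not being associative, so the result of conventional (recursive) summation depends on the order of the operands. The short sketch below demonstrates that order dependence in IEEE double precision. It then contrasts it with Python's standard `math.fsum`, which returns the correctly rounded sum and is therefore order-independent. This is a different technique from the paper's binned-number accumulator and is shown only to illustrate what an order-independent result looks like. The helper name `recursive_sum` and the sample data are assumptions.

```python
import math


def recursive_sum(values):
    """Conventional left-to-right summation; the result depends on operand order."""
    total = 0.0
    for v in values:
        total += v
    return total


# The same four numbers in two different orders (exact sum is 2).
ORDER_A = [1e16, 1.0, -1e16, 1.0]
ORDER_B = [1.0, 1.0, 1e16, -1e16]

if __name__ == "__main__":
    # Non-associativity on a familiar example.
    print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False

    # In ORDER_A the first 1.0 is absorbed by rounding when added to 1e16.
    print(recursive_sum(ORDER_A))  # 1.0
    print(recursive_sum(ORDER_B))  # 2.0

    # math.fsum tracks exact partial sums, so both orders agree.
    print(math.fsum(ORDER_A), math.fsum(ORDER_B))  # 2.0 2.0
```

In a parallel reduction the operand order can change from run to run with scheduling, which is how this effect turns into non-reproducible output.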