ACM Transactions on Mathematical Software: Latest Articles

Cache-oblivious Hilbert Curve-based Blocking Scheme for Matrix Transposition
IF 2.7 Zone 1 Mathematics
ACM Transactions on Mathematical Software Pub Date : 2022-12-19 DOI: https://dl.acm.org/doi/10.1145/3555353
João Nuno Ferreira Alves, Luís Manuel Silveira Russo, Alexandre Francisco
Abstract: This article presents a fast SIMD Hilbert space-filling curve generator, which supports a new cache-oblivious blocking-scheme technique applied to the out-of-place transposition of general matrices. Matrix operations found in high performance computing libraries are usually parameterized based on host microprocessor specifications to minimize data movement within the different levels of memory hierarchy. The performance of cache-oblivious algorithms does not rely on such parameterizations. This type of algorithm provides an elegant and portable solution to address the lack of standardization in modern-day processors. Our solution consists of an iterative blocking scheme that takes advantage of the locality-preserving properties of Hilbert space-filling curves to minimize data movement in any memory hierarchy. This scheme traverses the input matrix in O(nm) time and space, improving the behavior of matrix algorithms that inherently present poor memory locality. The application of this technique to the problem of out-of-place matrix transposition achieved competitive results when compared to state-of-the-art approaches. The performance of our solution surpassed the Intel MKL version after employing standard software prefetching techniques.
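The traversal idea in this abstract can be illustrated with a small sketch: matrix entries are visited in Hilbert-curve order, so consecutive accesses stay close together in both the input and the transposed output. This is not the authors' SIMD generator or their blocking scheme. The function names `d2xy` and `hilbert_transpose` are hypothetical, and the index-to-coordinate mapping is the standard textbook conversion, assumed here for illustration only.

```python
def d2xy(order, d):
    """Map index d on a Hilbert curve of side 2**order to (x, y)."""
    x = y = 0
    t = d
    s = 1
    n = 1 << order
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate or flip the current quadrant.
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_transpose(a):
    """Out-of-place transpose of a rectangular list-of-lists matrix."""
    rows, cols = len(a), len(a[0])
    out = [[None] * rows for _ in range(cols)]
    # Smallest power-of-two square that covers the matrix.
    order = (max(rows, cols) - 1).bit_length()
    for d in range(4 ** order):
        i, j = d2xy(order, d)
        if i < rows and j < cols:  # skip padding cells outside the matrix
            out[j][i] = a[i][j]
    return out
```

Calling `hilbert_transpose([[1, 2, 3], [4, 5, 6]])` walks a 4x4 covering square and skips the cells that fall outside the 2x3 input. A practical version would operate on tiles and avoid the padding overhead.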
Citations: 0
Remark on Algorithm 1010: Boosting Efficiency in Solving Quartic Equations with No Compromise in Accuracy
IF 2.7 Zone 1 Mathematics
ACM Transactions on Mathematical Software Pub Date : 2022-12-19 DOI: https://dl.acm.org/doi/10.1145/3564270
Cristiano De Michele
Abstract: We present a correction and an improvement to Algorithm 1010 [A. Orellana and C. De Michele 2020].
Citations: 0
Automatic Differentiation of C++ Codes on Emerging Manycore Architectures with Sacado
IF 2.7 Zone 1 Mathematics
ACM Transactions on Mathematical Software Pub Date : 2022-12-19 DOI: https://dl.acm.org/doi/10.1145/3560262
Eric Phipps, Roger Pawlowski, Christian Trott
Abstract: Automatic differentiation (AD) is a well-known technique for evaluating analytic derivatives of calculations implemented on a computer, with numerous software tools available for incorporating AD technology into complex applications. However, a growing challenge for AD is the efficient differentiation of parallel computations implemented on emerging manycore computing architectures such as multicore CPUs, GPUs, and accelerators as these devices become more pervasive. In this work, we explore forward mode, operator overloading-based differentiation of C++ codes on these architectures using the widely available Sacado AD software package. In particular, we leverage Kokkos, a C++ tool providing APIs for implementing parallel computations that is portable to a wide variety of emerging architectures. We describe the challenges that arise when differentiating code for these architectures using Kokkos, and two approaches for overcoming them that ensure optimal memory access patterns as well as expose additional dimensions of fine-grained parallelism in the derivative calculation. We describe the results of several computational experiments that demonstrate the performance of the approach on a few contemporary CPU and GPU architectures. We then conclude with applications of these techniques to the simulation of discretized systems of partial differential equations.
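Forward-mode AD by operator overloading, the technique named in this abstract, can be shown in a few lines. The sketch below uses dual numbers in Python and is not Sacado's or Kokkos's API. The class `Dual` and the helpers `_lift`, `sin`, and `derivative` are hypothetical names introduced only for illustration.

```python
import math


class Dual:
    """A value paired with its derivative with respect to one input."""

    def __init__(self, val, dot=0.0):
        self.val = val
        self.dot = dot

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (zero derivative)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def sin(x):
    """Chain rule for sine on a Dual argument."""
    return Dual(math.sin(x.val), math.cos(x.val) * x.dot)


def derivative(f, x):
    """Seed the input with derivative 1 and read off df/dx."""
    return f(Dual(x, 1.0)).dot
```

For example, `derivative(lambda x: 3 * x * x + 2 * x, 2.0)` evaluates the function once on overloaded values and yields the derivative 6x + 2 at x = 2. Each arithmetic operation carries its derivative alongside its value, which is why the memory layout of the derivative components matters on manycore hardware.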
Citations: 0
DIRECTGO: A New DIRECT-Type MATLAB Toolbox for Derivative-Free Global Optimization
IF 2.7 Zone 1 Mathematics
ACM Transactions on Mathematical Software Pub Date : 2022-12-19 DOI: https://dl.acm.org/doi/10.1145/3559755
Linas Stripinis, Remigijus Paulavičius
Abstract: In this work, we introduce DIRECTGO, a new MATLAB toolbox for derivative-free global optimization. DIRECTGO collects various deterministic derivative-free DIRECT-type algorithms for box-constrained, generally constrained, and problems with hidden constraints. Each sequential algorithm is implemented in two ways: using static and dynamic data structures for more efficient information storage and organization. Furthermore, parallel schemes are applied to some promising algorithms within DIRECTGO. The toolbox is equipped with a graphical user interface (GUI), ensuring the user-friendly use of all functionalities available in DIRECTGO. Available features are demonstrated in detailed computational studies using a comprehensive DIRECTGOLib v1.0 library of global optimization test problems. Additionally, 11 classical engineering design problems illustrate the potential of DIRECTGO to solve challenging real-world problems. Finally, the appendix gives examples of accompanying MATLAB programs and provides a synopsis of its use on the test problems with box and general constraints.
Citations: 0
Waveform Relaxation with Asynchronous Time-integration
IF 2.7 Zone 1 Mathematics
ACM Transactions on Mathematical Software Pub Date : 2022-12-19 DOI: https://dl.acm.org/doi/10.1145/3569578
Peter Meisrimel, Philipp Birken
Abstract: We consider Waveform Relaxation (WR) methods for parallel and partitioned time-integration of surface-coupled multiphysics problems. WR allows independent time-discretizations on independent and adaptive time-grids, while maintaining high time-integration orders. Classical WR methods such as Jacobi or Gauss-Seidel WR are typically either parallel or converge quickly. We present a novel parallel WR method utilizing asynchronous communication techniques to get both properties. Classical WR methods exchange discrete functions after time-integration of a subproblem. We instead asynchronously exchange time-point solutions during time-integration and directly incorporate all new information in the interpolants. We show both continuous and time-discrete convergence in a framework that generalizes existing linear WR convergence theory. An algorithm for choosing optimal relaxation in our new WR method is presented. Convergence is demonstrated in two conjugate heat transfer examples. Our new method shows an improved performance over classical WR methods. In one example, we show a partitioned coupling of the compressible Euler equations with a nonlinear heat equation, with subproblems implemented using the open source libraries DUNE and FEniCS.
Citations: 0
Algorithm 1034: An Accelerated Algorithm to Compute the Qn Robust Statistic, with Corrections to Constants
IF 2.7 Zone 1 Mathematics
ACM Transactions on Mathematical Software Pub Date : 2022-12-16 DOI: 10.1145/3576920
Thierry Fahmy
Abstract: The robust scale estimator Qn developed by Croux and Rousseeuw [3], for the computation of which they provided a deterministic algorithm, has proven to be very useful in several domains, including quality management and time series analysis. It has interesting mathematical (50% breakdown, 82% Asymptotic Relative Efficiency) and computing (O(n log n) time, O(n) space) properties. While working on a faster algorithm to compute Qn, we have discovered an error in the computation of the d constant, and as a consequence in the dn constants that are used to scale the statistic for consistency with the variance of a normal sample. These errors have been reproduced in several articles, including in the International Standard Organisation 13528 [12] document. In this article, we fix the errors and present a new approach, which includes a new algorithm, allowing computations to run 1.3 to 4.5 times faster when n grows from 10 to 100,000.
Pages: 1 - 12
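For orientation, the statistic itself can be computed directly from its definition: the k-th smallest absolute pairwise difference, with k = C(h, 2) and h = floor(n/2) + 1. The sketch below is a naive quadratic-time version, not the O(n log n) algorithm or the accelerated method of the article. The function name `qn_unscaled` is hypothetical, and the consistency constants d and dn are deliberately left out because the article reports corrections to them.

```python
from itertools import combinations
from math import comb


def qn_unscaled(xs):
    """k-th order statistic of |x_i - x_j|, i < j, without scaling constants."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    h = n // 2 + 1
    k = comb(h, 2)
    # Naive approach: materialize and sort all n*(n-1)/2 differences.
    diffs = sorted(abs(a - b) for a, b in combinations(xs, 2))
    return diffs[k - 1]
```

For the sample `[1, 2, 3, 4, 5]`, h = 3 and k = 3, so the result is the third smallest of the ten pairwise differences. Multiplying by the appropriate constant gives the scaled estimator.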
Citations: 0
Algorithm xxx: Parallel Implementations for Computing the Minimum Distance of a Random Linear Code on Distributed-memory Architectures
IF 2.7 Zone 1 Mathematics
ACM Transactions on Mathematical Software Pub Date : 2022-12-05 DOI: 10.1145/3573383
G. Quintana-Ortí, Fernando Hernando, F. D. Igual
Abstract: The minimum distance of a linear code is a key concept in information theory. Therefore, the time required by its computation is very important to many problems in this area. In this paper, we introduce a family of implementations of the Brouwer-Zimmermann algorithm for distributed-memory architectures for computing the minimum distance of a random linear code over $\mathbb{F}_{2}$. Both current commercial and public-domain software only work on either unicore architectures or shared-memory architectures, which are limited in the number of cores/processors employed in the computation. Our implementations focus on distributed-memory architectures, thus being able to employ hundreds or even thousands of cores in the computation of the minimum distance. Our experimental results show that our implementations are much faster, even up to several orders of magnitude, than current implementations widely used nowadays.
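To make the quantity concrete: the minimum distance of a binary linear code equals the smallest Hamming weight among its nonzero codewords. The sketch below finds it by exhaustive enumeration, which is exponential in the code dimension and is not the Brouwer-Zimmermann algorithm discussed in the article. The function name `min_distance` and the encoding of generator rows as integer bitmasks are assumptions made for illustration, and the rows are assumed to be linearly independent.

```python
def min_distance(gen_rows):
    """Smallest weight of a nonzero codeword; rows are bitmasks over F_2."""
    k = len(gen_rows)
    best = None
    for msg in range(1, 1 << k):  # every nonzero message vector
        word = 0
        for i in range(k):
            if (msg >> i) & 1:
                word ^= gen_rows[i]  # addition over F_2 is XOR
        weight = bin(word).count("1")
        if best is None or weight < best:
            best = weight
    return best
```

With the systematic generator matrix of the [7,4] Hamming code, `min_distance([0b1000110, 0b0100101, 0b0010011, 0b0001111])` enumerates the 15 nonzero codewords. The 2^k growth of this loop is the reason the computation benefits from large parallel machines.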
Citations: 0
Array-Aware Matching: Taming the Complexity of Large-Scale Simulation Models
IF 2.7 Zone 1 Mathematics
ACM Transactions on Mathematical Software Pub Date : 2022-11-22 DOI: 10.1145/3611661
Massimo Fioravanti, Daniele Cattaneo, F. Terraneo, Silvano Seva, Stefano Cherubin, G. Agosta, F. Casella, A. Leva
Abstract: Equation-based modelling is a powerful approach to tame the complexity of large-scale simulation problems. Equation-based tools automatically translate models into imperative languages. When confronted with today's problems, however, well-assessed model translation techniques exhibit scalability issues that are particularly severe when models contain very large arrays. In fact, such models can be made very compact by enclosing equations in looping constructs, but reflecting the same compactness in the translated imperative code is nontrivial. In this paper, we face this issue by concentrating on a key step of equations-to-code translation, the equation/variable matching. We first show that an efficient translation of models with (large) arrays needs awareness of their presence, by defining a figure of merit to measure how much the looping constructs are preserved along the translation. We then show that this figure of merit allows one to define an optimal array-aware matching, and as our main result, that the optimal array-aware matching problem so stated is NP-complete. As an additional result, we propose a heuristic algorithm capable of performing array-aware matching in polynomial time. The proposed algorithm can be proficiently used by model translator developers in the implementation of efficient tools for large-scale system simulation.
Volume: 49 1 Pages: 1 - 25
Citations: 1
Algorithm 1031: MQSI—Monotone Quintic Spline Interpolation
IF 2.7 Zone 1 Mathematics
ACM Transactions on Mathematical Software Pub Date : 2022-11-01 DOI: 10.1145/3570157
T. Lux, L.T. Watson, Tyler H. Chang, W. Thacker
Abstract: MQSI is a Fortran 2003 subroutine for constructing monotone quintic spline interpolants to univariate monotone data. Using sharp theoretical monotonicity constraints, first and second derivative estimates at data provided by a quadratic facet model are refined to produce a univariate C^2 monotone interpolant. Algorithm and implementation details, complexity and sensitivity analyses, usage information, a brief performance study, and comparisons with other spline approaches are included.
Volume: 49 1 Pages: 1 - 17
Citations: 2
Algorithm 1032: Bi-cubic Splines for Polyhedral Control Nets
IF 2.7 Zone 1 Mathematics
ACM Transactions on Mathematical Software Pub Date : 2022-10-31 DOI: 10.1145/3570158
J. Peters, K. Lo, K. Karčiauskas
Abstract: For control nets outlining a large class of topological polyhedra, not just tensor-product grids, bi-cubic polyhedral splines form a piecewise polynomial, first-order differentiable space that associates one function with each vertex. Akin to tensor-product splines, the resulting smooth surface approximates the polyhedron. Admissible polyhedral control nets consist of quadrilateral faces in a grid-like layout, star configurations where n ≠ 4 quadrilateral faces join around an interior vertex, n-gon configurations where 2n quadrilaterals surround an n-gon, polar configurations where a cone of n triangles meeting at a vertex is surrounded by a ribbon of n quadrilaterals, and three types of T-junctions where two quad-strips merge into one. The bi-cubic pieces of a polyhedral spline have matching derivatives along their break lines, possibly after a known change of variables. The pieces are represented in Bernstein-Bézier form with coefficients depending linearly on the polyhedral control net, so that evaluation, differentiation, integration, moments, and so on, are no more costly than for standard tensor-product splines. Bi-cubic polyhedral splines can be used both to model geometry and for computing functions on the geometry. Although polyhedral splines do not offer nested refinement by refinement of the control net, polyhedral splines support engineering analysis of curved smooth objects. Coarse nets typically suffice since the splines efficiently model curved features. Algorithm 1032 is a C++ library with input-output example pairs and an IGES output choice.
Volume: 49 1 Pages: 1 - 12
Citations: 2