ACM Transactions on Mathematical Software (TOMS): Latest Articles

A Computational Study of Using Black-box QR Solvers for Large-scale Sparse-dense Linear Least Squares Problems
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2022-02-16 DOI: 10.1145/3494527
J. Scott, M. Tuma
Abstract: Large-scale overdetermined linear least squares problems arise in many practical applications. One popular solution method is based on the backward stable QR factorization of the system matrix A. This article focuses on sparse-dense least squares problems in which A is sparse except for a small number of rows that are considered dense. For large-scale problems, the direct application of a QR solver either fails because of insufficient memory or is unacceptably slow. We study several solution approaches based on using a sparse QR solver without modification, focussing on the case that the sparse part of A is rank deficient. We discuss partial matrix stretching and regularization and propose extending the augmented system formulation with iterative refinement for sparse problems to sparse-dense problems, optionally incorporating multi-precision arithmetic. In summary, our computational study shows that, before applying a black-box QR factorization, a check should be made for rows that are classified as dense and, if such rows are identified, then A should be split into sparse and dense blocks; a number of ways to use a black-box QR factorization to exploit this splitting are possible, with no single method found to be the best in all cases.
Pages: 1-24
Citations: 7
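The dense-row check recommended in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: the function name `split_sparse_dense`, the dict-of-columns row format, and the `density_threshold` value are all assumptions made for illustration, and the abstract does not say how rows are classified as dense.

```python
def split_sparse_dense(rows, n_cols, density_threshold=0.5):
    """Partition row indices into sparse and dense sets.

    rows: list of dicts mapping column index -> nonzero value.
    density_threshold: hypothetical cutoff; a row whose fraction of
    nonzeros exceeds it is classified as dense.
    """
    sparse_rows, dense_rows = [], []
    for i, row in enumerate(rows):
        density = len(row) / n_cols
        if density > density_threshold:
            dense_rows.append(i)
        else:
            sparse_rows.append(i)
    return sparse_rows, dense_rows


# Toy 4 x 5 matrix: row 2 is fully populated, the others are sparse.
A = [
    {0: 2.0},
    {1: -1.0, 3: 4.0},
    {0: 1.0, 1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0},
    {4: 3.0},
]
sparse_idx, dense_idx = split_sparse_dense(A, n_cols=5)
```

After such a split, the sparse block would be passed to the sparse QR solver and the dense block handled separately, by one of the approaches the article compares.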
Source-to-Source Automatic Differentiation of OpenMP Parallel Loops
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2022-02-16 DOI: 10.1145/3472796
J. Hückelheim, L. Hascoët
Abstract: This article presents our work toward correct and efficient automatic differentiation of OpenMP parallel worksharing loops in forward and reverse mode. Automatic differentiation is a method to obtain gradients of numerical programs, which are crucial in optimization, uncertainty quantification, and machine learning. The computational cost to compute gradients is a common bottleneck in practice. For applications that are parallelized for multicore CPUs or GPUs using OpenMP, one also wishes to compute the gradients in parallel. We propose a framework to reason about the correctness of the generated derivative code, from which we justify our OpenMP extension to the differentiation model. We implement this model in the automatic differentiation tool Tapenade and present test cases that are differentiated following our extended differentiation procedure. Performance of the generated derivative programs in forward and reverse mode is better than sequential, although our reverse mode often scales worse than the input programs.
Pages: 1-32
Citations: 2
Kummer versus Montgomery Face-off over Prime Order Fields
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2022-02-11 DOI: 10.1145/3503536
K. Nath, P. Sarkar
Abstract: This paper makes a comprehensive comparison of the efficiencies of vectorized implementations of Kummer lines and Montgomery curves at various security levels. For the comparison, nine Kummer lines are considered, out of which eight are new, and new assembly implementations of all nine Kummer lines have been made. Seven previously proposed Montgomery curves are considered and new vectorized assembly implementations have been made for three of them. Our comparisons show that for all security levels, Kummer lines are consistently faster than Montgomery curves, though the speed-up is small.
Pages: 1-28
Citations: 1
Remark on Algorithm 982: Explicit Solutions of Triangular Systems of First-order Linear Initial-value Ordinary Differential Equations with Constant Coefficients
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2021-12-15 DOI: 10.1145/3479429
W. Van Snyder
Abstract: Algorithm 982: Explicit solutions of triangular systems of first-order linear initial-value ordinary differential equations with constant coefficients provides an explicit solution for a homogeneous system, and a brief description of how to compute a solution for the inhomogeneous case. The method described is not directly useful if the coefficient matrix is singular. This remark explains more completely how to compute the solution for the inhomogeneous case and for the singular coefficient matrix case.
Pages: 1-4
Citations: 0
Algorithm 1018: FaVeST—Fast Vector Spherical Harmonic Transforms
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2021-09-28 DOI: 10.1145/3458470
Q. L. Le Gia, Ming Li, Yu Guang Wang
Abstract: Vector spherical harmonics on the unit sphere of ℝ³ have broad applications in geophysics, quantum mechanics, and astrophysics. In the representation of a tangent vector field, one needs to evaluate the expansion and the Fourier coefficients of vector spherical harmonics. In this article, we develop fast algorithms (FaVeST) for vector spherical harmonic transforms on these evaluations. The forward FaVeST evaluates the Fourier coefficients and has a computational cost proportional to N log √N for N evaluation points. The adjoint FaVeST, which evaluates a linear combination of vector spherical harmonics with a degree up to √M for M evaluation points, has cost proportional to M log √M. Numerical examples of simulated tangent fields illustrate the accuracy, efficiency, and stability of FaVeST.
Pages: 1-24
Citations: 1
Corrigendum: Remark on Algorithm 723: Fresnel Integrals
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2021-09-28 DOI: 10.1145/3452336
W. Van Snyder
Abstract: There are mistakes and typographical errors in Remark on Algorithm 723: Fresnel Integrals, which appeared in ACM Transactions on Mathematical Software 22, 4 (December 1996). This remark corrects those errors. The software provided to Collected Algorithms of the ACM was correct.
Pages: 1-1
Citations: 0
Fast Matching Pursuit with Multi-Gabor Dictionaries
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2021-06-25 DOI: 10.1145/3447958
Zdeněk Průša, N. Holighaus, Péter Balázs
Abstract: Finding the best K-sparse approximation of a signal in a redundant dictionary is an NP-hard problem. Suboptimal greedy matching pursuit algorithms are generally used for this task. In this work, we present an acceleration technique and an implementation of the matching pursuit algorithm acting on a multi-Gabor dictionary, i.e., a concatenation of several Gabor-type time-frequency dictionaries, each of which consists of translations and modulations of a possibly different window and time and frequency shift parameters. The technique is based on pre-computing and thresholding inner products between atoms and on updating the residual directly in the coefficient domain, i.e., without the round-trip to the signal domain. Since the proposed acceleration technique involves an approximate update step, we provide theoretical and experimental results illustrating the convergence of the resulting algorithm. The implementation is written in C (compatible with C99 and C++11), and we also provide Matlab and GNU Octave interfaces. For some settings, the implementation is up to 70 times faster than the standard Matching Pursuit Toolkit.
Pages: 1-20
Citations: 3
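For orientation, the plain greedy matching pursuit that the article accelerates can be sketched as follows. This is the textbook algorithm with the residual updated in the signal domain, not the authors' coefficient-domain technique; the toy dictionary, the function names, and the iteration count are assumptions chosen for illustration.

```python
import math


def dot(u, v):
    """Inner product of two equal-length real vectors."""
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(signal, atoms, n_iter):
    """Greedy matching pursuit over unit-norm atoms.

    Returns (coeffs, residual), where coeffs[k] is the accumulated
    coefficient of atoms[k].
    """
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_iter):
        # Correlate the residual with every atom and select the atom
        # with the largest inner product in magnitude.
        products = [dot(residual, a) for a in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(products[i]))
        c = products[k]
        if c == 0.0:
            break
        coeffs[k] += c
        # Signal-domain residual update (the step the article replaces
        # with a coefficient-domain update).
        residual = [r - c * a for r, a in zip(residual, atoms[k])]
    return coeffs, residual


# Hypothetical redundant dictionary: four unit-norm atoms in R^3.
s = 1.0 / math.sqrt(2.0)
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [s, s, 0.0]]
signal = [3.0, 3.0, 0.5]
coeffs, residual = matching_pursuit(signal, atoms, n_iter=2)
```

In this toy case the diagonal atom is selected first and the third coordinate atom second, after which the residual is numerically zero.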
NEP
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2021-06-25 DOI: 10.1145/3447544
C. Campos, J. Román
Abstract: SLEPc is a parallel library for the solution of various types of large-scale eigenvalue problems. Over the past few years, we have been developing a module within SLEPc, called NEP, that is intended for solving nonlinear eigenvalue problems. These problems can be defined by means of a matrix-valued function that depends nonlinearly on a single scalar parameter. We do not consider the particular case of polynomial eigenvalue problems (which are implemented in a different module in SLEPc) and focus here on rational eigenvalue problems and other general nonlinear eigenproblems involving square roots or any other nonlinear function. The article discusses how the NEP module has been designed to fit the needs of applications and provides a description of the available solvers, including some implementation details such as parallelization. Several test problems coming from real applications are used to evaluate the performance and reliability of the solvers.
Pages: 1-29
Citations: 9
HyperNOMAD
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2021-06-25 DOI: 10.1145/3450975
Dounia Lakhmiri, Sébastien Le Digabel, C. Tribes
Abstract: The performance of deep neural networks is highly sensitive to the choice of the hyperparameters that define the structure of the network and the learning process. When facing a new application, tuning a deep neural network is a tedious and time-consuming process that is often described as a “dark art.” This explains the necessity of automating the calibration of these hyperparameters. Derivative-free optimization is a field that develops methods designed to optimize time-consuming functions without relying on derivatives. This work introduces the HyperNOMAD package, an extension of the NOMAD software that applies the MADS algorithm [7] to simultaneously tune the hyperparameters responsible for both the architecture and the learning process of a deep neural network (DNN). This generic approach allows considerable flexibility in the exploration of the search space by taking advantage of categorical variables. HyperNOMAD is tested on the MNIST, Fashion-MNIST, and CIFAR-10 datasets and achieves results comparable to the current state of the art.
Pages: 1-27
Citations: 23
PLANC
ACM Transactions on Mathematical Software (TOMS) Pub Date : 2021-06-25 DOI: 10.1145/3432185
Srinivas Eswar, Koby Hayashi, Grey Ballard, R. Kannan, Michael A. Matheson, Haesun Park
Abstract: We consider the problem of low-rank approximation of massive dense nonnegative tensor data, for example, to discover latent patterns in video and imaging applications. As the size of data sets grows, single workstations are hitting bottlenecks in both computation time and available memory. We propose a distributed-memory parallel computing solution to handle massive data sets, loading the input data across the memories of multiple nodes, and performing efficient and scalable parallel algorithms to compute the low-rank approximation. We present a software package called Parallel Low-rank Approximation with Nonnegativity Constraints, which implements our solution and allows for extension in terms of data (dense or sparse, matrices or tensors of any order), algorithm (e.g., from multiplicative updating techniques to alternating direction method of multipliers), and architecture (we exploit GPUs to accelerate the computation in this work). We describe our parallel distributions and algorithms, which are careful to avoid unnecessary communication and computation, show how to extend the software to include new algorithms and/or constraints, and report efficiency and scalability results for both synthetic and real-world data sets.
Pages: 1-37
Citations: 4