Algorithm xxx: A Covariate-Dependent Approach to Gaussian Graphical Modeling in R
Jacob Helwig, Sutanoy Dasgupta, Peng Zhao, Bani K. Mallick, Debdeep Pati
ACM Transactions on Mathematical Software. Published 2024-04-30. DOI: https://doi.org/10.1145/3659206

Abstract: Graphical models are used to capture complex multivariate relationships and have applications in diverse disciplines such as biology, physics, and economics. Within this field, Gaussian graphical models aim to identify the pairs of variables whose dependence is maintained even after conditioning on the remaining variables in the data, known as the conditional dependence structure of the data. Many software packages exist for Gaussian graphical modeling; however, they often make restrictive assumptions that reduce their flexibility for modeling data that are not identically distributed. In contrast, covdepGE is an R implementation of a variational weighted pseudo-likelihood algorithm for modeling the conditional dependence structure as a continuous function of an extraneous covariate. To build on the efficiency of this algorithm, covdepGE leverages parallelism and C++ integration with R. Additionally, covdepGE provides fully automated, data-driven hyperparameter specification while maintaining flexibility for the user to decide key components of the estimation procedure. Through an extensive simulation study spanning diverse settings, covdepGE is demonstrated to be top of its class at recovering the ground-truth conditional dependence structure while efficiently managing computational overhead.
Remark on Algorithm 1012: Computing projections with large data sets
Tyler H. Chang, Layne T. Watson, Sven Leyffer, Thomas C. H. Lux, Hussain M. J. Almohri
ACM Transactions on Mathematical Software. Published 2024-04-22. DOI: https://doi.org/10.1145/3656581

Abstract: In ACM TOMS Algorithm 1012, the DELAUNAYSPARSE software is given for performing Delaunay interpolation in medium to high dimensions. When extrapolating outside the convex hull of the training set, DELAUNAYSPARSE calls the nonnegative least squares solver DWNNLS to compute projections onto the convex hull. However, DWNNLS and many other available sum-of-squares optimization solvers were not intended for problems with many variables, which arise from the large training sets that are typical in machine learning applications. Thus, a new PROJECT subroutine is given, based on the highly customizable quadratic program solver BQPD. This solution is shown to be as robust as DELAUNAYSPARSE for projection onto both synthetic and real-world data sets where other available solvers frequently fail. Although it is intended as an update for DELAUNAYSPARSE, due to the difficulty and prevalence of the problem, this solution is likely to be of external interest as well.
{"title":"PyOED: An Extensible Suite for Data Assimilation and Model-Constrained Optimal Design of Experiments","authors":"Abhijit Chowdhary, Shady E. Ahmed, Ahmed Attia","doi":"10.1145/3653071","DOIUrl":"https://doi.org/10.1145/3653071","url":null,"abstract":"<p>This paper describes PyOED, a highly extensible scientific package that enables developing and testing model-constrained optimal experimental design (OED) for inverse problems. Specifically, PyOED aims to be a comprehensive <i>Python toolkit</i> for <i>model-constrained OED</i>. The package targets scientists and researchers interested in understanding the details of OED formulations and approaches. It is also meant to enable researchers to experiment with standard and innovative OED technologies with a wide range of test problems (e.g., simulation models). OED, inverse problems (e.g., Bayesian inversion), and data assimilation (DA) are closely related research fields, and their formulations overlap significantly. Thus, PyOED is continuously being expanded with a plethora of Bayesian inversion, DA, and OED methods as well as new scientific simulation models, observation error models, and observation operators. These pieces are added such that they can be permuted to enable testing OED methods in various settings of varying complexities. The PyOED core is completely written in Python and utilizes the inherent object-oriented capabilities; however, the current version of PyOED is meant to be extensible rather than scalable. Specifically, PyOED is developed to “enable rapid development and benchmarking of OED methods with minimal coding effort and to maximize code reutilization.” This paper provides a brief description of the PyOED layout and philosophy and provides a set of exemplary test cases and tutorials to demonstrate the potential of the package.</p>","PeriodicalId":50935,"journal":{"name":"ACM Transactions on Mathematical Software","volume":"144 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140199067","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Avoiding breakdown in incomplete factorizations in low precision arithmetic","authors":"Jennifer Scott, Miroslav Tůma","doi":"10.1145/3651155","DOIUrl":"https://doi.org/10.1145/3651155","url":null,"abstract":"<p>The emergence of low precision floating-point arithmetic in computer hardware has led to a resurgence of interest in the use of mixed precision numerical linear algebra. For linear systems of equations, there has been renewed enthusiasm for mixed precision variants of iterative refinement. We consider the iterative solution of large sparse systems using incomplete factorization preconditioners. The focus is on the robust computation of such preconditioners in half precision arithmetic and employing them to solve symmetric positive definite systems to higher precision accuracy; however, the proposed ideas can be applied more generally. Even for well-conditioned problems, incomplete factorizations can break down when small entries occur on the diagonal during the factorization. When using half precision arithmetic, overflows are an additional possible source of breakdown. We examine how breakdowns can be avoided and we implement our strategies within new half precision Fortran sparse incomplete Cholesky factorization software. Results are reported for a range of problems from practical applications. These demonstrate that, even for highly ill-conditioned problems, half precision preconditioners can potentially replace double precision preconditioners, although unsurprisingly this may be at the cost of additional iterations of a Krylov solver.</p>","PeriodicalId":50935,"journal":{"name":"ACM Transactions on Mathematical Software","volume":"133 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140129291","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Algorithm xxx: PyGenStability, a multiscale community detection with generalized Markov Stability
Alexis Arnaudon, Dominik J. Schindler, Robert L. Peach, Adam Gosztolai, Maxwell Hodges, Michael T. Schaub, Mauricio Barahona
ACM Transactions on Mathematical Software. Published 2024-03-11. DOI: https://doi.org/10.1145/3651225

Abstract: We present PyGenStability, a general-use Python software package that provides a suite of analysis and visualisation tools for unsupervised multiscale community detection in graphs. PyGenStability finds optimized partitions of a graph at different levels of resolution by maximizing the generalized Markov Stability quality function with the Louvain or Leiden algorithms. The package includes automatic detection of robust graph partitions and allows the flexibility to choose quality functions for weighted undirected, directed and signed graphs, and to include other user-defined quality functions.
Algorithm XXX: Sparse Precision Matrix Estimation With SQUIC
Aryan Eftekhari, Lisa Gaedke-Merzhäuser, Dimosthenis Pasadakis, Matthias Bollhöfer, Simon Scheidegger, Olaf Schenk
ACM Transactions on Mathematical Software. Published 2024-03-05. DOI: https://doi.org/10.1145/3650108

Abstract: We present SQUIC, a fast and scalable package for sparse precision matrix estimation. The algorithm employs a second-order method to solve the ℓ1-regularized maximum likelihood problem, utilizing highly optimized linear algebra subroutines. In comparative tests using synthetic datasets, we demonstrate that SQUIC not only scales to datasets of up to a million random variables but also consistently delivers run times that are significantly faster than other well-established sparse precision matrix estimation packages. Furthermore, we showcase the application of the introduced package in classifying microarray gene expressions. We demonstrate that by utilizing a matrix form of the tuning parameter (also known as the regularization parameter), SQUIC can effectively incorporate prior information into the estimation procedure, resulting in improved application results with minimal computational overhead.
{"title":"Optimal Re-Materialization Strategies for Heterogeneous Chains: How to Train Deep Neural Networks with Limited Memory","authors":"Olivier Beaumont, Lionel Eyraud-Dubois, Julien Herrmann, Alexis Joly, Alena Shilova","doi":"10.1145/3648633","DOIUrl":"https://doi.org/10.1145/3648633","url":null,"abstract":"<p>Training in Feed Forward Deep Neural Networks is a memory-intensive operation which is usually performed on GPUs with limited memory capacities. This may force data scientists to limit the depth of the models or the resolution of the input data if data does not fit in the GPU memory. The re-materialization technique, whose idea comes from the checkpointing strategies developed in the Automatic Differentiation literature, allows data scientists to limit the memory requirements related to the storage of intermediate data (activations), at the cost of an increase in the computational cost.</p><p>This paper introduces a new strategy of re-materialization of activations that significantly reduces memory usage. It consists in selecting which activations are saved and which activations are deleted during the forward phase, and then recomputing the deleted activations when they are needed during the backward phase.</p><p>We propose an original computation model that combines two types of activation savings: either only storing the layer inputs, or recording the complete history of operations that produced the outputs. This paper focuses on the fully heterogeneous case, where the computation time and the memory requirement of each layer is different. We prove that finding the optimal solution is NP-hard and that classical techniques from Automatic Differentiation literature do not apply. Moreover, the classical assumption of memory persistence of materialized activations, used to simplify the search of optimal solutions, does not hold anymore. Thus, we propose a weak memory persistence property and provide a Dynamic Program to compute the optimal sequence of computations.</p><p>This algorithm is made available through the <span>Rotor</span> software, a PyTorch plug-in dealing with any network consisting of a sequence of layers, each of them having an arbitrarily complex structure. Through extensive experiments, we show that our implementation consistently outperforms existing re-materialization approaches for a large class of networks, image sizes and batch sizes.</p>","PeriodicalId":50935,"journal":{"name":"ACM Transactions on Mathematical Software","volume":"44 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140044861","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hermitian Dynamic Mode Decomposition - numerical analysis and software solution","authors":"Zlatko Drmač","doi":"10.1145/3641884","DOIUrl":"https://doi.org/10.1145/3641884","url":null,"abstract":"<p>The Dynamic Mode Decomposition (DMD) is a versatile and increasingly popular method for data driven analysis of dynamical systems that arise in a variety of applications in e.g. computational fluid dynamics, robotics or machine learning. In the framework of numerical linear algebra, it is a data driven Rayleigh-Ritz procedure applied to a DMD matrix that is derived from the supplied data. In some applications, the physics of the underlying problem implies hermiticity of the DMD matrix, so the general DMD procedure is not computationally optimal. Furthermore, it does not guarantee important structural properties of the Hermitian eigenvalue problem and may return non-physical solutions. This paper proposes a software solution to the Hermitian (including the real symmetric) DMD matrices, accompanied with a numerical analysis that contains several fine and instructive numerical details. The eigenpairs are computed together with their residuals, and perturbation theory provides error bounds for the eigenvalues and eigenvectors. The software is developed and tested using the <sans-serif>LAPACK</sans-serif> package.</p>","PeriodicalId":50935,"journal":{"name":"ACM Transactions on Mathematical Software","volume":"274 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-01-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139581714","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A LAPACK implementation of the Dynamic Mode Decomposition","authors":"Zlatko Drmač","doi":"10.1145/3640012","DOIUrl":"https://doi.org/10.1145/3640012","url":null,"abstract":"<p>The Dynamic Mode Decomposition (DMD) is a method for computational analysis of nonlinear dynamical systems in data driven scenarios. Based on high fidelity numerical simulations or experimental data, the DMD can be used to reveal the latent structures in the dynamics or as a forecasting or a model order reduction tool. The theoretical underpinning of the DMD is the Koopman operator on a Hilbert space of observables of the dynamics under study. This paper describes a numerically robust and versatile variant of the DMD and its implementation using the state of the art dense numerical linear algebra software package <sans-serif>LAPACK</sans-serif>. The features of the proposed software solution include residual bounds for the computed eigenpairs of the DMD matrix, eigenvectors refinements and computation of the eigenvectors of the Exact DMD, compressed DMD for efficient analysis of high dimensional problems that can be easily adapted for fast updates in a streaming DMD. Numerical analysis is the bedrock of numerical robustness and reliability of the software, that is tested following the highest standards and practices of <sans-serif>LAPACK</sans-serif>. Important numerical topics are discussed in detail and illustrated using numerous numerical examples.</p>","PeriodicalId":50935,"journal":{"name":"ACM Transactions on Mathematical Software","volume":"12 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139499431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Algorithm XXX: Automatic Generators for a Family of Matrix Multiplication Routines with Apache TVM
Guillermo Alaejos, Adrián Castelló, Pedro Alonso-Jordá, Francisco D. Igual, Héctor Martínez, Enrique S. Quintana-Ortí
ACM Transactions on Mathematical Software. Published 2023-12-26. DOI: https://doi.org/10.1145/3638532

Abstract: We explore the utilization of the Apache TVM open source framework to automatically generate a family of algorithms that follow the approach taken by popular linear algebra libraries, such as GotoBLAS2, BLIS and OpenBLAS, in order to obtain high-performance blocked formulations of the general matrix multiplication (gemm). In addition, we fully automate the generation process by also leveraging the Apache TVM framework to derive a complete variety of processor-specific micro-kernels for gemm. This is in contrast with the convention in high-performance libraries, which hand-encode a single micro-kernel per architecture using assembly code. Overall, the combination of our TVM-generated blocked algorithms and micro-kernels for gemm 1) improves portability and maintainability and, globally, streamlines the software life cycle; 2) provides high flexibility to easily tailor and optimize the solution to different data types, processor architectures, and matrix operand shapes, yielding performance on a par with (or, for specific matrix shapes, even superior to) that of hand-tuned libraries; and 3) features a small memory footprint.