arXiv - CS - Mathematical Software: Latest Papers, Page 3

Conversion of Boolean and Integer FlatZinc Builtins to Quadratic or Linear Integer Problems

arXiv - CS - Mathematical Software Pub Date : 2024-04-19 DOI: arxiv-2404.12797

Armin Wolf

Citations: 0

Robustness and Accuracy in Pipelined Bi-Conjugate Gradient Stabilized Method: A Comparative Study

arXiv - CS - Mathematical Software Pub Date : 2024-04-19 DOI: arxiv-2404.13216

Mykhailo Havdiak, Jose I. Aliaga, Roman Iakymchuk

Citations: 0

GALÆXI: Solving complex compressible flows with high-order discontinuous Galerkin methods on accelerator-based systems

arXiv - CS - Mathematical Software Pub Date : 2024-04-19 DOI: arxiv-2404.12703

Daniel Kempf, Marius Kurz, Marcel Blind, Patrick Kopper, Philipp Offenhäuser, Anna Schwarz, Spencer Starr, Jens Keim, Andrea Beck

Abstract: This work presents GALÆXI as a novel, energy-efficient flow solver for the simulation of compressible flows on unstructured meshes leveraging the parallel computing power of modern Graphics Processing Units (GPUs). GALÆXI implements the high-order Discontinuous Galerkin Spectral Element Method (DGSEM) using shock capturing with a finite-volume subcell approach to ensure the stability of the high-order scheme near shocks. This work provides details on the general code design, the parallelization strategy, and the implementation approach for the compute kernels with a focus on the element-local mappings between volume and surface data due to the unstructured mesh. GALÆXI exhibits excellent strong scaling properties up to 1024 GPUs if each GPU is assigned a minimum of one million degrees of freedom. To verify its implementation, a convergence study is performed that recovers the theoretical order of convergence of the implemented numerical schemes. Moreover, the solver is validated using both the incompressible and compressible formulation of the Taylor-Green-Vortex at a Mach number of 0.1 and 1.25, respectively. A mesh convergence study shows that the results converge to the high-fidelity reference solution and that the results match the original CPU implementation. Finally, GALÆXI is applied to a large-scale wall-resolved large eddy simulation of a linear cascade of the NASA Rotor 37. Here, the supersonic region and shocks at the leading edge are captured accurately and robustly by the implemented shock-capturing approach. It is demonstrated that GALÆXI requires less than half of the energy to carry out this simulation in comparison to the reference CPU implementation. This makes GALÆXI a potent tool for accurate and efficient simulations of compressible flows in the realm of exascale computing and the associated new HPC architectures.

Citations: 0

Confirmable Workflows in OSCAR

arXiv - CS - Mathematical Software Pub Date : 2024-04-09 DOI: arxiv-2404.06241

Michael Joswig, Lars Kastner, Benjamin Lorenz

Citations: 0

SARIS: Accelerating Stencil Computations on Energy-Efficient RISC-V Compute Clusters with Indirect Stream Registers

arXiv - CS - Mathematical Software Pub Date : 2024-04-08 DOI: arxiv-2404.05303

Paul Scheffler, Luca Colagrande, Luca Benini

Citations: 0

Interactive Formal Specification for Mathematical Problems of Engineers

arXiv - CS - Mathematical Software Pub Date : 2024-04-08 DOI: arxiv-2404.05462

Walther Neuper (JKU - Johannes Kepler Universität Linz)

Citations: 0

Predefined Software Environment Runtimes As A Measure For Reproducibility

arXiv - CS - Mathematical Software Pub Date : 2024-04-08 DOI: arxiv-2404.05563

Aaruni Kaushik

Abstract: As part of the Mathematical Research Data Initiative (MaRDI), we have developed a way to preserve a software package into an easy to deploy and use sandbox environment we call a "runtime", via a program we developed called MaPS: MaRDI Packaging System. The program relies on Linux user namespaces to isolate a library environment from the host system, making the sandboxed software reproducible on other systems, with minimal effort. Moreover, an overlay filesystem makes local edits persistent. This project will aid reproducibility efforts of research papers: both mathematical and from other disciplines. As a proof of concept, we provide runtimes for the OSCAR Computer Algebra System, polymake software for research in polyhedral geometry, and VIBRANT Virus Identification By iteRative ANnoTation. The software is in a prerelease state: the interface for creating, deploying, and executing runtimes is final, and an interface for easily publishing runtimes is under active development. We thus propose publishing predefined, distributable software environment runtimes along with research papers in an effort to make research with software-based results reproducible.

Citations: 0

A shared compilation stack for distributed-memory parallelism in stencil DSLs

arXiv - CS - Mathematical Software Pub Date : 2024-04-02 DOI: arxiv-2404.02218

George Bisbas, Anton Lydike, Emilien Bauer, Nick Brown, Mathieu Fehr, Lawrence Mitchell, Gabriel Rodriguez-Canal, Maurice Jamieson, Paul H. J. Kelly, Michel Steuwer, Tobias Grosser

Abstract: Domain Specific Languages (DSLs) increase programmer productivity and provide high performance. Their targeted abstractions allow scientists to express problems at a high level, providing rich details that optimizing compilers can exploit to target current- and next-generation supercomputers. The convenience and performance of DSLs come with significant development and maintenance costs. The siloed design of DSL compilers and the resulting inability to benefit from shared infrastructure cause uncertainties around longevity and the adoption of DSLs at scale. By tailoring the broadly-adopted MLIR compiler framework to HPC, we bring the same synergies that the machine learning community already exploits across their DSLs (e.g. TensorFlow, PyTorch) to the finite-difference stencil HPC community. We introduce new HPC-specific abstractions for message passing targeting distributed stencil computations. We demonstrate the sharing of common components across three distinct HPC stencil-DSL compilers: Devito, PSyclone, and the Open Earth Compiler, showing that our framework generates high-performance executables based upon a shared compiler ecosystem.

Citations: 0

Inexactness and Correction of Floating-Point Reciprocal, Division and Square Root

arXiv - CS - Mathematical Software Pub Date : 2024-03-30 DOI: arxiv-2404.00387

Lucas M. Dutton, Christopher Kumar Anand, Robert Enenkel, Silvia Melitta Müller

Abstract: Floating-point arithmetic performance determines the overall performance of important applications, from graphics to AI. Meeting the IEEE-754 specification for floating-point requires that final results of addition, subtraction, multiplication, division, and square root are correctly rounded based on the user-selected rounding mode. A frustrating fact for implementers is that naive rounding methods will not produce correctly rounded results even when intermediate results with greater accuracy and precision are available. In contrast, our novel algorithm can correct approximations of reciprocal, division and square root, even ones with slightly lower than target precision. In this paper, we present a family of algorithms that can both increase the accuracy (and potentially the precision) of an estimate and correctly round it according to all binary IEEE-754 rounding modes. We explain how it may be efficiently implemented in hardware, and for completeness, we present proofs that it is not necessary to include equality tests associated with round-to-nearest-even mode for reciprocal, division and square root functions, because it is impossible for input(s) in a given precision to have exact answers exactly midway between representable floating-point numbers in that precision. In fact, our simpler proofs are sometimes stronger.
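
The midpoint claim in this abstract can be checked by brute force on a small format. The following is a minimal sketch, not the authors' algorithm or proof. It assumes a toy binary format with p-bit significands and an unbounded exponent range, so subnormal numbers (where the claim need not hold) are excluded. The helper names `odd_part`, `is_midpoint`, and `count_midpoint_results` are hypothetical.

```python
from fractions import Fraction
from math import isqrt


def odd_part(n: int) -> int:
    """Strip all factors of two from a positive integer."""
    while n % 2 == 0:
        n //= 2
    return n


def is_midpoint(q: Fraction, p: int) -> bool:
    """True if q > 0 lies exactly halfway between two adjacent p-bit floats.

    Assumption: unbounded exponent. Under it, the midpoints are exactly
    the dyadic rationals whose odd part needs p + 1 bits.
    """
    d = q.denominator
    if d & (d - 1):  # denominator is not a power of two, so q is not dyadic
        return False
    return odd_part(q.numerator).bit_length() == p + 1


def count_midpoint_results(p: int) -> tuple[int, int, int]:
    """Count exact results that are midpoints for x / y, 1 / y and sqrt(x).

    Scaling an operand by a power of two only shifts the result's exponent
    (for sqrt, a power of four), so enumerating integer significands in
    [2**(p-1), 2**p) covers every operand of the toy format.
    """
    sig = range(2 ** (p - 1), 2 ** p)
    division = sum(is_midpoint(Fraction(x, y), p) for x in sig for y in sig)
    reciprocal = sum(is_midpoint(Fraction(1, y), p) for y in sig)
    square_root = 0
    for x in sig:
        for e in (0, 1):  # even and odd exponents of the radicand
            v = x << e
            r = isqrt(v)
            # An irrational root can never be a midpoint, so only
            # perfect squares need to be examined.
            if r * r == v and is_midpoint(Fraction(r), p):
                square_root += 1
    return division, reciprocal, square_root


if __name__ == "__main__":
    print(count_midpoint_results(4))
    # Contrast: a product can be an exact tie in the same toy format.
    print(is_midpoint(Fraction(10 * 10), 4))
```

In this toy format with p = 4, the product 10 × 10 = 100 falls exactly halfway between the neighbouring values 96 and 104, so multiplication can produce ties, while the three counts for division, reciprocal, and square root stay at zero.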

Citations: 0

A Lie Group Approach to Riemannian Batch Normalization

arXiv - CS - Mathematical Software Pub Date : 2024-03-17 DOI: arxiv-2403.11261

Ziheng Chen, Yue Song, Yunmei Liu, Nicu Sebe

Abstract: Manifold-valued measurements exist in numerous applications within computer vision and machine learning. Recent studies have extended Deep Neural Networks (DNNs) to manifolds, and concomitantly, normalization techniques have also been adapted to several manifolds, referred to as Riemannian normalization. Nonetheless, most of the existing Riemannian normalization methods have been derived in an ad hoc manner and only apply to specific manifolds. This paper establishes a unified framework for Riemannian Batch Normalization (RBN) techniques on Lie groups. Our framework offers the theoretical guarantee of controlling both the Riemannian mean and variance. Empirically, we focus on Symmetric Positive Definite (SPD) manifolds, which possess three distinct types of Lie group structures. Using the deformation concept, we generalize the existing Lie groups on SPD manifolds into three families of parameterized Lie groups. Specific normalization layers induced by these Lie groups are then proposed for SPD neural networks. We demonstrate the effectiveness of our approach through three sets of experiments: radar recognition, human action recognition, and electroencephalography (EEG) classification. The code is available at https://github.com/GitZH-Chen/LieBN.git.

Citations: 0