arXiv - CS - Mathematical Software: Latest Papers

Flexible Multi-Dimensional FFTs for Plane Wave Density Functional Theory Codes
arXiv - CS - Mathematical Software Pub Date: 2024-06-08 DOI: arxiv-2406.05577
Doru Thom Popovici, Mauro del Ben, Osni Marques, Andrew Canning
Abstract: Multi-dimensional Fourier transforms are key mathematical building blocks that appear in a wide range of applications from materials science, physics, chemistry and even machine learning. Over the past years, a multitude of software packages targeting distributed multi-dimensional Fourier transforms have been developed. Most variants attempt to offer efficient implementations for single transforms applied on data mapped onto rectangular grids. However, not all scientific applications conform to this pattern; for example, plane wave Density Functional Theory codes require multi-dimensional Fourier transforms applied on data represented as batches of spheres. Typically, the implementations for this use case are hand-coded and tailored for the requirements of each application. In this work, we present the Fastest Fourier Transform from Berkeley (FFTB), a distributed framework that offers flexible implementations for both regular/non-regular data grids and batched/non-batched transforms. We provide flexible implementations with a user-friendly API that captures most of the use cases. Furthermore, we provide implementations for both CPU and GPU platforms, showing that our approach offers improved execution time and scalability on the HPE Cray EX supercomputer. In addition, we outline the need for flexible implementations for different use cases of the software package.
Citations: 0
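As a small illustration of what a multi-dimensional Fourier transform on a rectangular grid involves, the sketch below computes a 2D transform as 1D transforms along each axis in turn. It uses a naive O(n^2) DFT for brevity and is not FFTB's API or algorithm; the function names `dft_1d` and `dft_2d` are hypothetical.

```python
import cmath


def dft_1d(x):
    """Naive discrete Fourier transform of a sequence (O(n^2), for illustration only)."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
        for k in range(n)
    ]


def dft_2d(grid):
    """2D transform of a rectangular grid: transform every row, then every column."""
    rows = [dft_1d(row) for row in grid]
    cols = [dft_1d(list(col)) for col in zip(*rows)]
    # cols is indexed [column][row]; transpose back to [row][column].
    return [list(row) for row in zip(*cols)]


# A unit impulse at the origin transforms to a constant spectrum of ones.
impulse = [[1, 0, 0], [0, 0, 0]]
spectrum = dft_2d(impulse)
```

Distributed libraries typically have to redistribute data between the per-axis passes, which is one reason non-rectangular layouts such as spheres need dedicated handling.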
svds-C: A Multi-Thread C Code for Computing Truncated Singular Value Decomposition
arXiv - CS - Mathematical Software Pub Date: 2024-05-29 DOI: arxiv-2405.18966
Xu Feng, Wenjian Yu, Yuyang Xie
Abstract: This article presents svds-C, an open-source and high-performance C program for accurately and robustly computing the truncated SVD, i.e. computing the several largest singular values and the corresponding singular vectors. We have re-implemented the algorithm of svds in Matlab in C, based on MKL or OpenBLAS and multi-thread computing, to obtain the parallel program named svds-C. svds-C running on a shared-memory computer consumes less time and memory than svds thanks to careful implementation of multi-thread parallelization and memory management. Numerical experiments on different test cases, which are synthetically generated or taken directly from real-world datasets, show that svds-C runs remarkably faster than svds, with an average speedup of 4.7X and a maximum speedup of 12X for 16-thread parallel computing on a computer with an Intel CPU, while preserving the same accuracy and consuming about half the memory space. Experimental results also demonstrate that svds-C has similar advantages over svds on a computer with an AMD CPU, and outperforms other state-of-the-art algorithms for truncated SVD in computing time and robustness.
Citations: 0
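To make "computing the largest singular values" concrete, here is a minimal sketch that estimates only the single largest singular value by power iteration on A^T A. This is a deliberately simple stand-in and not the algorithm used by svds or svds-C; the helper names are hypothetical.

```python
import math


def matvec(a, x):
    """Dense matrix-vector product for a matrix stored as a list of rows."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def top_singular_value(a, iters=200):
    """Estimate the largest singular value of `a` by power iteration on A^T A."""
    at = transpose(a)
    v = [1.0] * len(a[0])
    for _ in range(iters):
        w = matvec(at, matvec(a, v))  # apply A^T A without forming it
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    av = matvec(a, v)
    return math.sqrt(sum(c * c for c in av))  # ||A v|| for unit-norm v


a = [[3.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
sigma = top_singular_value(a)  # singular values of this matrix are 3 and 1
```

Power iteration converges slowly when the leading singular values are close together, which is one reason production codes rely on more robust Krylov-type methods.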
GridapTopOpt.jl: A scalable Julia toolbox for level set-based topology optimisation
arXiv - CS - Mathematical Software Pub Date: 2024-05-17 DOI: arxiv-2405.10478
Zachary J. Wegert, Jordi Manyer, Connor Mallon, Santiago Badia, Vivien J. Challis
Abstract: In this paper we present GridapTopOpt, an extendable framework for level set-based topology optimisation that can be readily distributed across a personal computer or high-performance computing cluster. The package is written in Julia and uses the Gridap package ecosystem for parallel finite element assembly from arbitrary weak formulations of partial differential equations (PDEs) along with the scalable solvers from the Portable and Extendable Toolkit for Scientific Computing (PETSc). The resulting user interface is intuitive and easy-to-use, allowing for the implementation of a wide range of topology optimisation problems with a syntax that is near one-to-one with the mathematical notation. Furthermore, we implement automatic differentiation to help mitigate the bottleneck associated with the analytic derivation of sensitivities for complex problems. GridapTopOpt is capable of solving a range of benchmark and research topology optimisation problems with large numbers of degrees of freedom. This educational article demonstrates the usability and versatility of the package by describing the formulation and step-by-step implementation of several distinct topology optimisation problems. The driver scripts for these problems are provided and the package source code is available at https://github.com/zjwegert/GridapTopOpt.jl.
Citations: 0
PyOptInterface: Design and implementation of an efficient modeling language for mathematical optimization
arXiv - CS - Mathematical Software Pub Date: 2024-05-16 DOI: arxiv-2405.10130
Yue Yang, Chenhui Lin, Luo Xu, Wenchuan Wu
Abstract: This paper introduces the design and implementation of PyOptInterface, a modeling language for mathematical optimization embedded in the Python programming language. PyOptInterface uses lightweight and compact data structures to efficiently bridge high-level entities in optimization models, such as variables and constraints, to the internal indices of optimizers. It supports a variety of optimization solvers and a range of common problem classes. We provide benchmarks to exhibit the competitive performance of PyOptInterface compared with other state-of-the-art modeling languages.
Citations: 0
Local Adjoints for Simultaneous Preaccumulations with Shared Inputs
arXiv - CS - Mathematical Software Pub Date: 2024-05-13 DOI: arxiv-2405.07819
Johannes Blühdorn, Nicolas R. Gauger
Abstract: In shared-memory parallel automatic differentiation, shared inputs among simultaneous thread-local preaccumulations lead to data races if Jacobians are accumulated with a single, shared vector of adjoint variables. In this work, we discuss the benefits and tradeoffs of re-enabling such preaccumulations by a transition to suitable local adjoint variables. In particular, we assess the performance of mapped local adjoints in discrete adjoint computations in the multiphysics simulation suite SU2.
Citations: 0
Hybrid Parallel Discrete Adjoints in SU2
arXiv - CS - Mathematical Software Pub Date: 2024-05-09 DOI: arxiv-2405.06056
Johannes Blühdorn, Pedro Gomes, Max Aehle, Nicolas R. Gauger
Abstract: The open-source multiphysics suite SU2 features discrete adjoints by means of operator overloading automatic differentiation (AD). While both primal and discrete adjoint solvers support MPI parallelism, hybrid parallelism using both MPI and OpenMP has only been introduced for the primal solvers so far. In this work, we enable hybrid parallel discrete adjoint solvers. Coupling SU2 with OpDiLib, an add-on for operator overloading AD tools that extends AD to OpenMP parallelism, marks a key step in this endeavour. We identify the affected parts of SU2's advanced AD workflow and discuss the required changes and their tradeoffs. Detailed performance studies compare MPI parallel and hybrid parallel discrete adjoints in terms of memory and runtime and unveil key performance characteristics. We showcase the effectiveness of performance optimizations and highlight perspectives for future improvements. At the same time, this study demonstrates the applicability of OpDiLib in a large code base and its scalability on large test cases, providing valuable insights for future applications both within and beyond SU2.
Citations: 0
A Sparse Tensor Generator with Efficient Feature Extraction
arXiv - CS - Mathematical Software Pub Date: 2024-05-08 DOI: arxiv-2405.04944
Tugba Torun, Eren Yenigul, Ameer Taweel, Didem Unat
Abstract: Sparse tensor operations are gaining attention in emerging applications such as social networks, deep learning, diagnosis, crime, and review analysis. However, a major obstacle for research in sparse tensor operations is the lack of a broad-scale sparse tensor dataset. Another challenge in sparse tensor operations is examining the sparse tensor features, which are not only important for revealing its nonzero pattern but also have a significant impact on determining the best-suited storage format, the decomposition algorithm, and the reordering methods. However, due to the large sizes of real tensors, even extracting these features becomes costly without caution. To address these gaps in the literature, we have developed a smart sparse tensor generator that mimics the substantial features of real sparse tensors. Moreover, we propose various methods for efficiently extracting an extensive set of features for sparse tensors. The effectiveness of our generator is validated through the quality of features and the performance of decomposition in the generated tensors. Both the sparse tensor feature extractor and the tensor generator are open source with all the artifacts available at https://github.com/sparcityeu/feaTen and https://github.com/sparcityeu/genTen, respectively.
Citations: 0
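To give a sense of what "features" of a sparse tensor can look like, the following sketch computes a few simple statistics from a coordinate (COO) list in a single pass over the nonzeros. The feature names and the function `tensor_features` are illustrative assumptions and do not reflect the feaTen feature set or interface.

```python
from collections import Counter


def tensor_features(shape, coords, mode=0):
    """Simple features of a sparse tensor given the coordinates of its nonzeros."""
    nnz = len(coords)
    total = 1
    for dim in shape:
        total *= dim
    # Count nonzeros in each slice along the chosen mode.
    per_slice = Counter(c[mode] for c in coords)
    return {
        "nnz": nnz,
        "density": nnz / total,
        "nonempty_slices": len(per_slice),
        "max_nnz_per_slice": max(per_slice.values()) if per_slice else 0,
    }


# A 3 x 2 x 3 tensor with three nonzeros, two of them in slice 0 of mode 0.
coords = [(0, 0, 0), (0, 1, 2), (2, 1, 1)]
features = tensor_features((3, 2, 3), coords)
```

Because the statistics depend only on the coordinate list, the cost scales with the number of nonzeros rather than with the full tensor size.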
Performance of H-Matrix-Vector Multiplication with Floating Point Compression
arXiv - CS - Mathematical Software Pub Date: 2024-05-06 DOI: arxiv-2405.03456
Ronald Kriemann
Abstract: Matrix-vector multiplication forms the basis of many iterative solution algorithms and as such is an important algorithm also for hierarchical matrices. However, due to its low computational intensity, its performance is typically limited by the available memory bandwidth. By optimizing the storage representation of the data within such matrices, this limitation can be lifted and the performance increased. This applies not only to hierarchical matrices but also to other low-rank approximation schemes, e.g. block low-rank matrices.
Citations: 0
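The idea of trading storage precision for memory traffic can be sketched with a single low-rank block stored as factors U and V, so that the block is U V^T. In the sketch below the factors are kept in single precision (via Python's `array` type code `"f"`) while the arithmetic is done in double precision. This is only an assumed, simplified stand-in for the compression schemes studied in the paper; the function names are hypothetical.

```python
from array import array


def compress(factor):
    """Store each row of a factor matrix in single precision (4 bytes per entry)."""
    return [array("f", row) for row in factor]


def low_rank_matvec(u, v, x):
    """Compute y = U (V^T x), where U is m x k and V is n x k."""
    k = len(v[0])
    t = [sum(v[j][r] * x[j] for j in range(len(x))) for r in range(k)]
    return [sum(row[r] * t[r] for r in range(k)) for row in u]


u = [[1.0, 0.5], [0.25, 2.0], [3.0, 0.125]]
v = [[0.1, 0.2], [0.3, 0.4]]
x = [1.0, 2.0]

y_exact = low_rank_matvec(u, v, x)
y_compressed = low_rank_matvec(compress(u), compress(v), x)
# Error introduced by rounding the stored factors to single precision.
error = max(abs(a - b) for a, b in zip(y_exact, y_compressed))
```

Halving the bytes per stored entry halves the data read per product, which matters when the operation is limited by memory bandwidth rather than arithmetic.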
Minimization of Nonlinear Energies in Python Using FEM and Automatic Differentiation Tools
arXiv - CS - Mathematical Software Pub Date: 2024-05-03 DOI: arxiv-2407.04706
Michal Béreš, Jan Valdman
Abstract: This contribution examines the capabilities of the Python ecosystem to solve nonlinear energy minimization problems, with a particular focus on transitioning from traditional MATLAB methods to Python's advanced computational tools, such as automatic differentiation. We demonstrate Python's streamlined approach to minimizing nonlinear energies by analyzing three benchmark problems: the p-Laplacian, the Ginzburg-Landau model, and the Neo-Hookean hyperelasticity. This approach merely requires the provision of the energy functional itself, making it a simple and efficient way to solve this category of problems. The results show that the implementation is about ten times faster than the MATLAB implementation for large-scale problems. Our findings highlight Python's efficiency and ease of use in scientific computing, establishing it as a preferable choice for implementing sophisticated mathematical models and accelerating the development of numerical simulations.
Citations: 0
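The claim that only the energy functional needs to be supplied can be illustrated with a toy example: a hand-rolled forward-mode automatic differentiation class differentiates a scalar energy, and plain gradient descent minimizes it. The `Dual` class, the energy E(u) = u^4/4 - u, and the step size are assumptions made for illustration and are not the tools or benchmarks used in the paper.

```python
class Dual:
    """Minimal dual number for forward-mode automatic differentiation."""

    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    @staticmethod
    def _lift(other):
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        o = self._lift(other)
        return Dual(self.val + o.val, self.der + o.der)

    __radd__ = __add__

    def __sub__(self, other):
        o = self._lift(other)
        return Dual(self.val - o.val, self.der - o.der)

    def __mul__(self, other):
        o = self._lift(other)
        return Dual(self.val * o.val, self.der * o.val + self.val * o.der)

    __rmul__ = __mul__


def energy(u):
    """Toy energy E(u) = u^4 / 4 - u, minimized at u = 1."""
    return 0.25 * u * u * u * u - u


def minimize(energy_fn, u0, lr=0.1, steps=200):
    """Gradient descent; the derivative comes from AD, not a hand-written formula."""
    u = u0
    for _ in range(steps):
        grad = energy_fn(Dual(u, 1.0)).der
        u -= lr * grad
    return u


u_min = minimize(energy, 0.5)
```

The user writes only `energy`; changing the model means changing that one function, since the gradient is derived automatically.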
Finch: Sparse and Structured Array Programming with Control Flow
arXiv - CS - Mathematical Software Pub Date: 2024-04-25 DOI: arxiv-2404.16730
Willow Ahrens, Teodoro Fields Collin, Radha Patel, Kyle Deeds, Changwan Hong, Saman Amarasinghe
Abstract: From FORTRAN to NumPy, arrays have revolutionized how we express computation. However, arrays in these, and almost all prominent systems, can only handle dense rectilinear integer grids. Real world arrays often contain underlying structure, such as sparsity, runs of repeated values, or symmetry. Support for structured data is fragmented and incomplete. Existing frameworks limit the array structures and program control flow they support to better simplify the problem. In this work, we propose a new programming language, Finch, which supports both flexible control flow and diverse data structures. Finch facilitates a programming model which resolves the challenges of computing over structured arrays by combining control flow and data structures into a common representation where they can be co-optimized. Finch automatically specializes control flow to data so that performance engineers can focus on experimenting with many algorithms. Finch supports a familiar programming language of loops, statements, ifs, breaks, etc., over a wide variety of array structures, such as sparsity, run-length-encoding, symmetry, triangles, padding, or blocks. Finch reliably utilizes the key properties of structure, such as structural zeros, repeated values, or clustered non-zeros. We show that this leads to dramatic speedups in operations such as SpMV and SpGEMM, image processing, graph analytics, and a high-level tensor operator fusion interface.
Citations: 0
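The benefit of exploiting structural zeros and repeated values can be shown with a hand-specialized dot product over a run-length-encoded vector. The sketch is plain Python rather than Finch syntax, and it illustrates the kind of loop such a compiler might generate; it is not output from Finch, and the helper names are hypothetical.

```python
def rle_encode(values):
    """Run-length encode a sequence into (value, run_length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [(v, n) for v, n in runs]


def rle_dot(runs, dense):
    """Dot product of a run-length-encoded vector with a dense vector."""
    total = 0
    pos = 0
    for value, length in runs:
        if value != 0:
            # One multiplication per run instead of one per element.
            total += value * sum(dense[pos:pos + length])
        # Zero runs are skipped without reading the dense operand.
        pos += length
    return total


x = [0, 0, 0, 2, 2, 2, 2, 0, 5]
y = [1, 2, 3, 4, 5, 6, 7, 8, 9]
runs = rle_encode(x)
result = rle_dot(runs, y)
```

Writing such specialized loops by hand for every combination of formats and operations quickly becomes unmanageable, which is the problem a structure-aware compiler aims to remove.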