$O(N)$ distributed direct factorization of structured dense matrices using runtime systems

Sameer Deshmukh, Qinxiang Ma, Rio Yokota, George Bosilca
arXiv - CS - Mathematical Software, published 2023-11-02. DOI: arxiv-2311.00921 (https://doi.org/arxiv-2311.00921)

Abstract

Structured dense matrices arise from boundary integral problems in electrostatics and geostatistics, and as Schur complements in sparse preconditioners such as multifrontal methods. Exploiting the structure of such matrices can reduce the time for dense direct factorization from $O(N^3)$ to $O(N)$. The Hierarchically Semi-Separable (HSS) matrix is one such low-rank matrix format, and it can be factorized using a Cholesky-like algorithm called ULV factorization. The HSS-ULV algorithm is highly parallel because it removes the dependency on trailing sub-matrices at each HSS level. However, a key merge step that links two successive HSS levels remains a challenge for efficient parallelization. In this paper, we use the asynchronous runtime system PaRSEC with the HSS-ULV algorithm. We compare our work with STRUMPACK and LORAPO, both state-of-the-art implementations of dense direct low-rank factorization, and achieve up to 2x better factorization time for matrices arising from a diverse set of applications on up to 128 nodes of Fugaku, with similar or better accuracy for all the problems that we survey.
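
As a rough illustration of the low-rank structure the abstract refers to, a one-level HSS partition of a matrix $A$ is commonly written as shown here. The symbols $D_i$, $U_i$, $V_i$, $B_{ij}$ and the rank $k$ follow the usual HSS literature and are assumptions for this sketch, not notation taken from the paper:

$$
A =
\begin{bmatrix}
D_1 & U_1 B_{12} V_2^{T} \\
U_2 B_{21} V_1^{T} & D_2
\end{bmatrix}
$$

Here $D_1$ and $D_2$ are dense diagonal blocks, $U_i$ and $V_i$ are tall basis matrices with $k \ll N$ columns, and $B_{12}$, $B_{21}$ are small $k \times k$ coupling matrices. Applying the same partition recursively to $D_1$ and $D_2$, with the bases at each level expressed through those of the level below, gives the hierarchical format. When $k$ stays bounded as $N$ grows, this nesting is what makes $O(N)$ storage and factorization cost plausible.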