A New Data-Mapping Scheme for Latency-Tolerant Distributed Sparse Triangular Solution

ACM/IEEE SC 2002 Conference (SC'02), published 2002-11-16, DOI: 10.1109/SC.2002.10020

K. Teranishi, P. Raghavan, E. Ng


Citations: 12

Abstract


This paper concerns latency-tolerant schemes for the efficient parallel solution of sparse triangular linear systems on distributed memory multiprocessors. Such triangular solution is required when sparse Cholesky factors are used to solve for a sequence of right-hand-side vectors or when incomplete sparse Cholesky factors are used to precondition a Conjugate Gradients iterative solver. In such applications, the use of traditional distributed substitution schemes can create a performance bottleneck when the latency of interprocessor communication is large. We had earlier developed the Selective Inversion (SI) scheme to reduce communication latency costs by replacing distributed substitution by parallel matrix vector multiplication. We now present a new two-way mapping of the triangular sparse matrix to processors to improve the performance of SI by halving its communication latency costs. We provide analytic results for model sparse matrices and we report on the performance of our scheme for parallel preconditioning with incomplete sparse Cholesky factors.
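The core idea the abstract attributes to Selective Inversion, replacing substitution with a matrix-vector product, can be illustrated on a single dense triangular block. The sketch below is only a single-process illustration under stated assumptions: the function names (`forward_substitution`, `invert_lower_triangular`, `matvec`) and the 3×3 example block are hypothetical, and it does not model the distribution across processors, the communication, or the two-way mapping that the paper proposes.

```python
def forward_substitution(L, b):
    """Solve L x = b for a lower-triangular L, one unknown at a time."""
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        # x[i] needs every earlier x[j]; in a distributed setting this
        # dependency chain is what makes substitution latency-bound.
        s = sum(L[i][j] * x[j] for j in range(i))
        x[i] = (b[i] - s) / L[i][i]
    return x


def invert_lower_triangular(L):
    """Form the explicit inverse of L by solving L * col = e_k for each k."""
    n = len(L)
    inv = [[0.0] * n for _ in range(n)]
    for k in range(n):
        e_k = [1.0 if i == k else 0.0 for i in range(n)]
        col = forward_substitution(L, e_k)
        for i in range(n):
            inv[i][k] = col[i]
    return inv


def matvec(M, v):
    """Dense matrix-vector product; each row is independent of the others."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


# Hypothetical dense triangular block and right-hand side (illustration only).
L_block = [[2.0, 0.0, 0.0],
           [1.0, 4.0, 0.0],
           [3.0, 2.0, 5.0]]
b = [2.0, 9.0, 22.0]

# Traditional approach: substitute row by row.
x_sub = forward_substitution(L_block, b)

# Inversion approach: invert the block once, then reuse it for every
# right-hand side with a matrix-vector product.
L_inv = invert_lower_triangular(L_block)
x_mv = matvec(L_inv, b)

print(x_sub)
print(x_mv)
```

Both approaches give the same solution up to rounding. The inversion is paid once, which matches the settings named in the abstract, where the same factor is applied to many right-hand sides or in every iteration of a preconditioned solver.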
