CholeskyQR2:一种简单且避免通信的大规模并行系统高瘦QR分解算法

Takeshi Fukaya, Y. Nakatsukasa, Yuka Yanagisawa, Yusaku Yamamoto
{"title":"CholeskyQR2:一种简单且避免通信的大规模并行系统高瘦QR分解算法","authors":"Takeshi Fukaya, Y. Nakatsukasa, Yuka Yanagisawa, Yusaku Yamamoto","doi":"10.1109/ScalA.2014.11","DOIUrl":null,"url":null,"abstract":"Designing communication-avoiding algorithms is crucial for high performance computing on a large-scale parallel system. The TSQR algorithm is a communication-avoiding algorithm for computing a tall-skinny QR factorization, and TSQR is known to be much faster and as stable as the classical Householder QR algorithm. The Cholesky QR algorithm is another very simple and fast communication-avoiding algorithm, but rarely used in practice because of its numerical instability. Our recent work points out that an algorithm that simply repeats Cholesky QR twice, which we call CholeskyQR2, gives excellent accuracy for a wide range of matrices arising in practice. Although the communication cost of CholeskyQR2 is twice that of TSQR, it has an advantage that its reduction operation is addition whereas that of TSQR is a QR factorization, whose high-performance implementation is more difficult. Thus, CholeskyQR2 can potentially be significantly faster than TSQR. Indeed, in our experiments using 16384 nodes of the K computer, CholeskyQR2 ran about three times faster than TSQR for a 4194304 × 64 matrix.","PeriodicalId":323689,"journal":{"name":"2014 5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"31","resultStr":"{\"title\":\"CholeskyQR2: A Simple and Communication-Avoiding Algorithm for Computing a Tall-Skinny QR Factorization on a Large-Scale Parallel System\",\"authors\":\"Takeshi Fukaya, Y. Nakatsukasa, Yuka Yanagisawa, Yusaku Yamamoto\",\"doi\":\"10.1109/ScalA.2014.11\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Designing communication-avoiding algorithms is crucial for high performance computing on a large-scale parallel system. The TSQR algorithm is a communication-avoiding algorithm for computing a tall-skinny QR factorization, and TSQR is known to be much faster and as stable as the classical Householder QR algorithm. The Cholesky QR algorithm is another very simple and fast communication-avoiding algorithm, but rarely used in practice because of its numerical instability. Our recent work points out that an algorithm that simply repeats Cholesky QR twice, which we call CholeskyQR2, gives excellent accuracy for a wide range of matrices arising in practice. Although the communication cost of CholeskyQR2 is twice that of TSQR, it has an advantage that its reduction operation is addition whereas that of TSQR is a QR factorization, whose high-performance implementation is more difficult. Thus, CholeskyQR2 can potentially be significantly faster than TSQR. Indeed, in our experiments using 16384 nodes of the K computer, CholeskyQR2 ran about three times faster than TSQR for a 4194304 × 64 matrix.\",\"PeriodicalId\":323689,\"journal\":{\"name\":\"2014 5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-11-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"31\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2014 5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ScalA.2014.11\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ScalA.2014.11","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 31

摘要

避免通信算法的设计对于大规模并行系统的高性能计算至关重要。TSQR算法是一种避免通信的算法,用于计算高瘦QR分解,TSQR算法比传统的Householder QR算法更快,更稳定。Cholesky QR算法是另一种非常简单和快速的通信避免算法,但由于其数值不稳定性,在实际中很少使用。我们最近的工作指出,一种简单地重复两次CholeskyQR的算法,我们称之为CholeskyQR2,对于实践中出现的各种矩阵具有出色的准确性。虽然CholeskyQR2的通信成本是TSQR的两倍,但它的优点是它的约简操作是加法,而TSQR的约简操作是QR分解,其高性能实现难度较大。因此,CholeskyQR2可能比TSQR快得多。事实上,在我们使用K计算机的16384个节点的实验中,对于4194304 × 64矩阵,CholeskyQR2的运行速度比TSQR快三倍。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
CholeskyQR2: A Simple and Communication-Avoiding Algorithm for Computing a Tall-Skinny QR Factorization on a Large-Scale Parallel System
Designing communication-avoiding algorithms is crucial for high performance computing on a large-scale parallel system. The TSQR algorithm is a communication-avoiding algorithm for computing a tall-skinny QR factorization, and TSQR is known to be much faster and as stable as the classical Householder QR algorithm. The Cholesky QR algorithm is another very simple and fast communication-avoiding algorithm, but rarely used in practice because of its numerical instability. Our recent work points out that an algorithm that simply repeats Cholesky QR twice, which we call CholeskyQR2, gives excellent accuracy for a wide range of matrices arising in practice. Although the communication cost of CholeskyQR2 is twice that of TSQR, it has an advantage that its reduction operation is addition whereas that of TSQR is a QR factorization, whose high-performance implementation is more difficult. Thus, CholeskyQR2 can potentially be significantly faster than TSQR. Indeed, in our experiments using 16384 nodes of the K computer, CholeskyQR2 ran about three times faster than TSQR for a 4194304 × 64 matrix.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信