Improved Internode Communication for Tile QR Decomposition for Multicore Cluster Systems

2015 IEEE International Parallel and Distributed Processing Symposium Workshop. Pub Date: 2015-05-25. DOI: 10.1109/IPDPSW.2015.145

Tomohiro Suzuki


Citations: 1

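Abstract

Tile algorithms for matrix decomposition can generate many fine-grained tasks. Their suitability for multicore architectures has therefore attracted much attention from the high-performance computing (HPC) community. Our implementation of tile QR decomposition for cluster systems provides dynamic scheduling, OpenMP work-sharing, and other useful features. In this article, we discuss the internode communication problems that were present in our previous implementation. The improved implementation exhibits both strong and weak scalability.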


