GPU-based parallel Householder bidiagonalization

IEEE International Symposium on High-Performance Parallel and Distributed Computing

Pub Date: 2010-06-21

DOI: 10.1145/1851476.1851512

Fangbing Liu, F. Seinstra

Citations: 4

Abstract

In this paper, we discuss the GPU-based implementation and optimization of Householder bidiagonalization, a matrix factorization method that is an integral part of the full Singular Value Decomposition (SVD), an important algorithm for many problems in the research domain of Multimedia Content Analysis (MMCA). On cluster computers, complex adaptive run-time techniques must often be implemented to overcome the growing negative performance impact of load imbalances and to ensure reasonable speedup. We show that the nature of the many-core platform removes the need for such complex run-time parallelization techniques in software, while achieving a performance of 64 Gflop/s in double precision on a single-GPU GTX 295, which is 82% of the theoretical peak performance.
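For readers unfamiliar with the factorization named in the abstract, the following is a minimal sequential sketch of textbook Householder (Golub-Kahan) bidiagonalization in pure Python. It is not the authors' GPU implementation and does not reflect their optimizations; the function names `householder` and `bidiagonalize` are hypothetical, and the sketch assumes a dense matrix with at least as many rows as columns. It only illustrates the alternating left and right reflections that reduce a matrix to upper bidiagonal form.

```python
import math

def householder(x):
    """Return (v, beta) so that (I - beta * v v^T) x has zeros below its first entry."""
    norm = math.sqrt(sum(t * t for t in x))
    alpha = -math.copysign(norm, x[0])  # sign chosen to avoid cancellation
    v = list(x)
    v[0] -= alpha
    vv = sum(t * t for t in v)
    beta = 0.0 if vv == 0.0 else 2.0 / vv  # beta = 0 means "no reflection needed"
    return v, beta

def bidiagonalize(a):
    """Reduce an m x n matrix (m >= n, list of lists) to upper bidiagonal form."""
    m, n = len(a), len(a[0])
    b = [row[:] for row in a]  # work on a copy
    for k in range(n):
        # Left reflection: zero out column k below the diagonal.
        v, beta = householder([b[i][k] for i in range(k, m)])
        for j in range(k, n):
            dot = sum(v[i - k] * b[i][j] for i in range(k, m))
            for i in range(k, m):
                b[i][j] -= beta * dot * v[i - k]
        # Right reflection: zero out row k to the right of the superdiagonal.
        if k < n - 2:
            v, beta = householder([b[k][j] for j in range(k + 1, n)])
            for i in range(k, m):
                dot = sum(b[i][j] * v[j - k - 1] for j in range(k + 1, n))
                for j in range(k + 1, n):
                    b[i][j] -= beta * dot * v[j - k - 1]
    return b

if __name__ == "__main__":
    a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0], [2.0, 1.0, 0.0]]
    for row in bidiagonalize(a):
        print(["%.4f" % t for t in row])
```

Each step applies the same reflection to every remaining column (or row), so the work per step is uniform across the trailing submatrix. The sketch omits accumulating the orthogonal factors, which a full SVD would also need.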




