Optimizing Hyperplane Sweep Operations Using Asynchronous Multi-grain GPU Tasks

2019 IEEE International Symposium on Workload Characterization (IISWC). Pub Date: 2019-11-01. DOI: 10.1109/IISWC47752.2019.9042134

A. Kaushik, Ashwin M. Aji, M. A. Hassaan, N. Chalmers, Noah Wolfe, Scott Moe, Sooraj Puthoor, Bradford M. Beckmann

Citations: 2

Abstract

General-Purpose Graphics Processing Units (GPGPUs) are employed in today's fastest supercomputers to accelerate a variety of scientific compute workloads. These workloads typically consist of data-parallel mathematical kernels that are well suited for execution on GPUs. The hyperplane sweep operation is one such mathematical kernel that is commonly used in high-performance computing. In this paper, we characterize the conventional bulk synchronous hyperplane sweep implementation currently used by GPUs and identify significant potential for performance improvement by breaking the operation into finer-grain tasks. Guided by this characterization, we propose multi-grain task decomposition and scheduling techniques to optimize the operation. We use KRIPKE, which features the sweep operation, as a case study, and we show that our proposed optimizations achieve a 41% speedup over the bulk synchronous implementation.
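
For readers unfamiliar with the term, the sketch below illustrates what a bulk synchronous hyperplane sweep generally looks like. It is not taken from the paper or from KRIPKE: the function names `hyperplanes` and `sweep`, the toy update rule, and the fixed sweep direction are all assumptions made for illustration. It assumes a 3D grid in which each cell depends on its three upwind neighbors, so that all cells with the same index sum i + j + k form one hyperplane and can be processed independently of each other.

```python
from collections import defaultdict


def hyperplanes(nx, ny, nz):
    """Group the cells of an nx x ny x nz grid by hyperplane index d = i + j + k."""
    planes = defaultdict(list)
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                planes[i + j + k].append((i, j, k))
    # Hyperplane indices run from 0 to (nx - 1) + (ny - 1) + (nz - 1).
    return [planes[d] for d in range(nx + ny + nz - 2)]


def sweep(nx, ny, nz, source=1.0):
    """Toy sweep: each cell is a source term plus its three upwind neighbors.

    The update rule is a stand-in chosen for illustration only.
    """
    phi = {}
    for plane in hyperplanes(nx, ny, nz):
        # Cells in one hyperplane read only from earlier hyperplanes, so this
        # inner loop is the part a GPU kernel could execute in parallel.
        for (i, j, k) in plane:
            phi[(i, j, k)] = (
                source
                + phi.get((i - 1, j, k), 0.0)
                + phi.get((i, j - 1, k), 0.0)
                + phi.get((i, j, k - 1), 0.0)
            )
        # In a bulk synchronous scheme, a global barrier sits here before the
        # next hyperplane starts.
    return phi


if __name__ == "__main__":
    print([len(p) for p in hyperplanes(2, 2, 2)])
    print(sweep(2, 2, 2)[(1, 1, 1)])
```

The hyperplanes near the start and end of the sweep contain few cells, so a scheme that waits at a barrier after every hyperplane leaves much of the device idle at those points. This is one plausible reading of why finer-grain, asynchronous tasks could help; the paper's actual decomposition is not described in this abstract.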

