Non-intrusive Performance Analysis of Parallel Hardware Accelerated Applications on Hybrid Architectures

2010 39th International Conference on Parallel Processing Workshops. Pub Date: 2010-09-13. DOI: 10.1109/ICPPW.2010.30

R. Dietrich, T. Ilsche, G. Juckeland

Citations: 21

Abstract

New high performance computing (HPC) applications must cope with both scalability over an increasing number of nodes and the programming of special accelerator hardware. The hybrid composition of large computing systems adds a new dimension of complexity to software development. This paper presents a novel approach to gain insight into accelerator interaction and utilization without any changes to the application. It extends well-established performance analysis methods to accelerator hardware, allowing a holistic view of the performance bottlenecks of hybrid applications. A general strategy is presented for obtaining dynamic runtime information about hybrid program execution with minimal impact on the program flow. The achievable level of detail is studied using the CUDA environment and the OpenCL framework as examples. Combined with existing performance analysis techniques, this helps applications exploit the full potential of hybrid computing power.



