Python for Development of OpenMP and CUDA Kernels for Multidimensional Data

2011 Symposium on Application Accelerators in High-Performance Computing. Pub Date: 2011-07-19. DOI: 10.1109/SAAHPC.2011.26

B. Vacaliuc, D. Patlolla, E. D'Azevedo, G. Davidson, John K. Munro Jr, T. Evans, W. Joubert, Z. Bell


Citations: 1

Abstract

Design of data structures for high performance computing (HPC) is one of the principal challenges facing researchers looking to utilize heterogeneous computing machinery. Heterogeneous systems derive cost, power, and speed efficiency by being composed of the appropriate hardware for the task. Yet each type of processor requires a specific organization of the application state in order to achieve peak performance. Discovering this organization and refactoring the code can be a challenging and time-consuming task for the researcher, as the data structures and the computational model must be co-designed. We present a methodology that uses Python as the environment in which to explore tradeoffs in both the data structure design and the code executing on the computation accelerator. Our method enables multi-dimensional arrays to be used effectively in any target environment. We have chosen to focus on OpenMP and CUDA environments, thus exploring the development of optimized kernels for the two most common classes of computing hardware available today: multi-core CPU and GPU. Python's large palette of file and network access routines, its associative indexing syntax, and its support for common HPC environments make it relevant for diverse hardware ranging from laptops through computing clusters to the highest performance supercomputers. Our work enables researchers to accelerate the development of their codes on the computing hardware of their choice.
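
To make the data-layout tradeoff mentioned in the abstract concrete, the following is a minimal sketch of how one logical multi-dimensional index can map to different memory offsets depending on the chosen axis ordering. It is an illustration under stated assumptions, not the authors' implementation: the helper names `strides_for` and `flat_offset` and the example shape are hypothetical and do not come from the paper.

```python
from itertools import product


def strides_for(shape, order):
    """Return the element stride of each axis.

    `order` lists the axes from slowest-varying to fastest-varying
    in memory (hypothetical helper, for illustration only).
    """
    strides = [0] * len(shape)
    step = 1
    # Walk from the fastest-varying axis outward, accumulating extents.
    for axis in reversed(order):
        strides[axis] = step
        step *= shape[axis]
    return strides


def flat_offset(index, strides):
    """Map a multi-dimensional index to a flat memory offset."""
    return sum(i * s for i, s in zip(index, strides))


shape = (2, 3, 4)  # assumed example extents for axes (i, j, k)

# Last axis fastest (C-style) versus first axis fastest (Fortran-style).
row_major = strides_for(shape, order=(0, 1, 2))
col_major = strides_for(shape, order=(2, 1, 0))

# Both layouts must cover every flat offset exactly once.
all_indices = list(product(*(range(n) for n in shape)))
row_offsets = sorted(flat_offset(ix, row_major) for ix in all_indices)
col_offsets = sorted(flat_offset(ix, col_major) for ix in all_indices)

print("row-major strides:", row_major)
print("col-major strides:", col_major)
```

Which axis varies fastest determines which loop or thread index touches adjacent memory. On a GPU, neighbouring threads generally benefit from reading neighbouring elements, while a multi-core CPU loop typically benefits when its innermost iteration is the unit-stride one, so the same logical array may call for a different ordering on each target.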


