COMPOFF: A Compiler Cost model using Machine Learning to predict the Cost of OpenMP Offloading

2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). Pub Date: 2022-05-01. DOI: 10.1109/IPDPSW55747.2022.00074

Alok Mishra, Smeet Chheda, Carlos Soto, A. Malik, Meifeng Lin, Barbara M. Chapman


Citations: 4

Abstract

The HPC industry is inexorably moving towards an era of extremely heterogeneous architectures, with more devices configured on any given HPC platform and potentially more kinds of devices, some of them highly specialized. Writing separate code for each target system for a given HPC application is not practical. A better solution is to use directive-based parallel programming models such as OpenMP. OpenMP provides a number of options for offloading a piece of code to devices like GPUs. To select the best of these options during compilation, most modern compilers use analytical models to estimate the cost of executing the original code and the different offloading code variants. Building such an analytical model is a difficult task that necessitates a lot of effort on the part of a compiler engineer. Recently, machine learning techniques have been successfully applied to build cost models for a variety of compiler optimization problems. In this paper, we present COMPOFF, a cost model that statically estimates the Cost of OpenMP OFFloading using a neural network model. We applied six different transformations to a parallel implementation of the Wilson Dslash operator to support GPU offloading, and we used COMPOFF to predict their execution cost on different GPUs at compile time. Our results show that this model can predict offloading costs with a root mean squared error of less than 0.5 seconds. Our preliminary findings indicate that this work will make it much easier and faster for scientists and compiler developers to port legacy HPC applications that use OpenMP to new heterogeneous computing environments.
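For readers unfamiliar with the reported metric, the following is a minimal sketch of how a root mean squared error between predicted and measured runtimes can be computed. It is not the authors' evaluation code; the function name `rmse` and the runtime values are hypothetical and chosen only for illustration.

```python
import math


def rmse(predicted, measured):
    """Root mean squared error between two equal-length sequences of runtimes (seconds)."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical runtimes in seconds for six offloading variants (not data from the paper).
predicted_s = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
measured_s = [1.5, 2.5, 2.5, 4.5, 4.5, 6.5]

error_s = rmse(predicted_s, measured_s)
print(f"RMSE: {error_s:.3f} s")
```

Because the error is squared before averaging, a single badly mispredicted variant raises the RMSE more than several small misses would, which makes it a conservative summary of how well a cost model ranks variants.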

