Jetson Nano平台的OpenMP卸载

Ilias K. Kasmeridis, V. Dimakopoulos
{"title":"Jetson Nano平台的OpenMP卸载","authors":"Ilias K. Kasmeridis, V. Dimakopoulos","doi":"10.1145/3547276.3548517","DOIUrl":null,"url":null,"abstract":"The nvidia Jetson Nano is a very popular system-on-module and developer kit which brings high-performance specs in a small and power-efficient embedded platform. Integrating a 128-core gpu and a quad-core cpu, it provides enough capabilities to support computationally demanding applications such as AI inference, deep learning and computer vision. While the Jetson Nano family supports a number of apis and libraries out of the box, comprehensive support of OpenMP, one of the most popular apis, is not readily available. In this work we present the implementation of an OpenMP infrastructure that is able to harness both the cpu and the gpu of a Jetson Nano board using the offload facilities of the recent versions of the OpenMP specifications. We discuss the compiler-side transformations of key constructs, the generation of cuda-based code as well as how the runtime support is provided. We also provide experimental results for a number of applications, exhibiting performance comparable with their pure cuda versions.","PeriodicalId":255540,"journal":{"name":"Workshop Proceedings of the 51st International Conference on Parallel Processing","volume":"2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"OpenMP Offloading in the Jetson Nano Platform\",\"authors\":\"Ilias K. Kasmeridis, V. Dimakopoulos\",\"doi\":\"10.1145/3547276.3548517\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The nvidia Jetson Nano is a very popular system-on-module and developer kit which brings high-performance specs in a small and power-efficient embedded platform. Integrating a 128-core gpu and a quad-core cpu, it provides enough capabilities to support computationally demanding applications such as AI inference, deep learning and computer vision. While the Jetson Nano family supports a number of apis and libraries out of the box, comprehensive support of OpenMP, one of the most popular apis, is not readily available. In this work we present the implementation of an OpenMP infrastructure that is able to harness both the cpu and the gpu of a Jetson Nano board using the offload facilities of the recent versions of the OpenMP specifications. We discuss the compiler-side transformations of key constructs, the generation of cuda-based code as well as how the runtime support is provided. We also provide experimental results for a number of applications, exhibiting performance comparable with their pure cuda versions.\",\"PeriodicalId\":255540,\"journal\":{\"name\":\"Workshop Proceedings of the 51st International Conference on Parallel Processing\",\"volume\":\"2 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-08-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Workshop Proceedings of the 51st International Conference on Parallel Processing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3547276.3548517\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Workshop Proceedings of the 51st International Conference on Parallel Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3547276.3548517","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

nvidia Jetson Nano是一个非常受欢迎的系统模块和开发工具包,它在一个小而节能的嵌入式平台上带来了高性能的规格。它集成了128核gpu和四核cpu,提供了足够的能力来支持人工智能推理、深度学习和计算机视觉等计算要求很高的应用程序。虽然Jetson Nano系列支持许多开箱即用的api和库,但对最流行的api之一OpenMP的全面支持并不容易获得。在这项工作中,我们提出了一个OpenMP基础设施的实现,该基础设施能够利用Jetson Nano板的cpu和gpu,使用最新版本的OpenMP规范的卸载设施。我们将讨论关键结构的编译器端转换、基于cuda的代码的生成以及如何提供运行时支持。我们还提供了一些应用程序的实验结果,显示出与纯cuda版本相当的性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
OpenMP Offloading in the Jetson Nano Platform
The nvidia Jetson Nano is a very popular system-on-module and developer kit which brings high-performance specs in a small and power-efficient embedded platform. Integrating a 128-core gpu and a quad-core cpu, it provides enough capabilities to support computationally demanding applications such as AI inference, deep learning and computer vision. While the Jetson Nano family supports a number of apis and libraries out of the box, comprehensive support of OpenMP, one of the most popular apis, is not readily available. In this work we present the implementation of an OpenMP infrastructure that is able to harness both the cpu and the gpu of a Jetson Nano board using the offload facilities of the recent versions of the OpenMP specifications. We discuss the compiler-side transformations of key constructs, the generation of cuda-based code as well as how the runtime support is provided. We also provide experimental results for a number of applications, exhibiting performance comparable with their pure cuda versions.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信