Proceedings of the 2023 International Workshop on OpenCL: Latest Publications

Profile Guided Optimization Transfer-Learning for OpenCL/SYCL Kernel Compilation and Runtime
Proceedings of the 2023 International Workshop on OpenCL Pub Date : 2023-04-18 DOI: 10.1145/3585341.3585359
Wenju He, Maosu Zhao, Yuxin Zou, Feng Zou
Abstract: Reducing SYCL kernel compilation time and runtime overhead are important topics for heterogeneous computing performance. Profile-Guided Optimization (PGO) is an optimization technique widely used in compilers to better optimize code. We apply PGO to both SYCL kernel compilation and the backend runtime. The first experiment demonstrates transfer learning: profiling data collected from the SPEC CPU® 2006 benchmark can benefit kernel compilation on OpenCL/SYCL benchmarks. The second experiment also demonstrates transfer learning: profiling data collected from some OpenCL/SYCL benchmarks can be used to reduce CPU backend runtime overhead in unseen benchmarks.
Citations: 0
Stellar Mergers with HPX-Kokkos and SYCL: Methods of using an Asynchronous Many-Task Runtime System with SYCL
Proceedings of the 2023 International Workshop on OpenCL Pub Date : 2023-03-04 DOI: 10.1145/3585341.3585354
Gregor Daiß, Patrick Diehl, H. Kaiser, D. Pflüger
Abstract: Ranging from NVIDIA GPUs to AMD GPUs and Intel GPUs: Given the heterogeneity of available accelerator cards within current supercomputers, portability is a key aspect for modern HPC applications. In Octo-Tiger, an astrophysics application simulating binary star systems and stellar mergers, we rely on Kokkos and its various execution spaces for portable compute kernels. In turn, we use HPX, a distributed task-based runtime system, to coordinate kernel launches, CPU tasks, and communication. This combination allows us to have a fine interleaving between portable CPU/GPU computations and communication, enabling scalability on various supercomputers. However, for HPX and Kokkos to work together optimally, we need to be able to treat Kokkos kernels as HPX tasks. Otherwise, instead of integrating asynchronous Kokkos kernel launches into HPX’s task graph, we would have to actively wait for them with fence commands, which wastes CPU time better spent otherwise. Using an integration layer called HPX-Kokkos, treating Kokkos kernels as tasks already works for some Kokkos execution spaces (like the CUDA one), but not for others (like the SYCL one). In this work, we started making Octo-Tiger and HPX itself compatible with SYCL. To do so, we introduce numerous software changes, most notably an HPX-SYCL integration. This integration allows us to treat SYCL events as HPX tasks, which in turn allows us to better integrate Kokkos by extending the support of HPX-Kokkos to also fully support Kokkos’ SYCL execution space. We show two ways to implement this HPX-SYCL integration and test them using Octo-Tiger and its Kokkos kernels, on both an NVIDIA A100 and an AMD MI100. We find modest, yet noticeable, speedups (1.11x to 1.15x for the relevant configurations) by enabling this integration, even when just running simple single-node scenarios with Octo-Tiger where communication and CPU utilization are not yet an issue. We further find that the integration using event polling within the HPX scheduler works far better than the alternative implementation using SYCL host tasks.
Citations: 2
Proceedings of the 2023 International Workshop on OpenCL
Proceedings of the 2023 International Workshop on OpenCL Pub Date : 1900-01-01 DOI: 10.1145/3585341
Citations: 0