Proceedings of the 2023 International Workshop on OpenCL: Latest Articles

Profile Guided Optimization Transfer-Learning for OpenCL/SYCL Kernel Compilation and Runtime

Proceedings of the 2023 International Workshop on OpenCL Pub Date : 2023-04-18 DOI: 10.1145/3585341.3585359

Wenju He, Maosu Zhao, Yuxin Zou, Feng Zou

Citations: 0

Stellar Mergers with HPX-Kokkos and SYCL: Methods of using an Asynchronous Many-Task Runtime System with SYCL

Proceedings of the 2023 International Workshop on OpenCL Pub Date : 2023-03-04 DOI: 10.1145/3585341.3585354

Gregor Daiß, Patrick Diehl, H. Kaiser, D. Pflüger

Abstract: Ranging from NVIDIA GPUs to AMD GPUs and Intel GPUs: Given the heterogeneity of available accelerator cards within current supercomputers, portability is a key aspect for modern HPC applications. In Octo-Tiger, an astrophysics application simulating binary star systems and stellar mergers, we rely on Kokkos and its various execution spaces for portable compute kernels. In turn, we use HPX, a distributed task-based runtime system, to coordinate kernel launches, CPU tasks, and communication. This combination allows us to have a fine interleaving between portable CPU/GPU computations and communication, enabling scalability on various supercomputers. However, for HPX and Kokkos to work together optimally, we need to be able to treat Kokkos kernels as HPX tasks. Otherwise, instead of integrating asynchronous Kokkos kernel launches into HPX’s task graph, we would have to actively wait for them with fence commands, which wastes CPU time better spent otherwise. Using an integration layer called HPX-Kokkos, treating Kokkos kernels as tasks already works for some Kokkos execution spaces (like the CUDA one), but not for others (like the SYCL one). In this work, we started making Octo-Tiger and HPX itself compatible with SYCL. To do so, we introduce numerous software changes most notably an HPX-SYCL integration. This integration allows us to treat SYCL events as HPX tasks, which in turn allows us to better integrate Kokkos by extending the support of HPX-Kokkos to also fully support Kokkos’ SYCL execution space. We show two ways to implement this HPX-SYCL integration and test them using Octo-Tiger and its Kokkos kernels, on both an NVIDIA A100 and an AMD MI100.

We find modest, yet noticeable, speedups (1.11x to 1.15x for the relevant configurations) by enabling this integration, even when just running simple single-node scenarios with Octo-Tiger where communication and CPU utilization are not yet an issue. We further find that the integration using event polling within the HPX scheduler works far better than the alternative implementation using SYCL host tasks.

Citations: 2

Proceedings of the 2023 International Workshop on OpenCL

Proceedings of the 2023 International Workshop on OpenCL Pub Date : 1900-01-01 DOI: 10.1145/3585341

Citations: 0
