Performance Evolution of Different SYCL Implementations based on the Parallel Least Squares Support Vector Machine Library

Marcel Breyer, Alexander Van Craen, D. Pflüger
Published in: Proceedings of the 2023 International Workshop on OpenCL, 2023-04-18. DOI: https://doi.org/10.1145/3585341.3585369

Abstract

In machine learning and scientific computing, one of the biggest challenges is computing that is efficient, performant, and portable at the same time. With our Parallel Least Squares Support Vector Machine (PLSSVM) library, we have not only developed an unrivaled Support Vector Machine (SVM) implementation for huge dense data sets, but have also created a representative benchmark for a task frequently encountered in scientific computing: an (implicit) matrix-vector multiplication. PLSSVM supports multiple backends (OpenMP, CUDA, HIP, OpenCL, and SYCL) in order to target the most widely used hardware platforms in machine learning and scientific computing. In this paper, we use PLSSVM to compare different DPC++ and Open SYCL (formerly known as hipSYCL) versions over the period of one year. Furthermore, we compare two versions (one from February and the other from November 2022) with each other and report their respective performance evolution in depth. We also relate these results to our other backends and report their performance portability on three different hardware platforms: an NVIDIA GPU, an AMD GPU, and an Intel CPU. Our results show that installing new DPC++ and Open SYCL versions can have surprisingly large impacts in both directions. In our case, the nd_range kernel runtimes were up to […] faster on an NVIDIA GPU when using a newer DPC++ compiler. For Open SYCL, using the new omp.accelerated compilation flow improves the nd_range performance on CPUs by over […]. Compared to OpenCL, SYCL also offers better performance portability in our results while being easier to use, as indicated by the drastically fewer lines of code needed in our PLSSVM library. While OpenCL only has a performance portability of […], DPC++ achieved the highest value with […] under the performance portability metric of Pennycook et al. [23]. The code, utility scripts, and documentation are all publicly available on GitHub: https://github.com/SC-SGS/PLSSVM.
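The performance portability metric of Pennycook et al. referenced above is the harmonic mean of an application's efficiency over a set of platforms, and it is defined to be zero as soon as one platform in the set is unsupported. The following is a minimal sketch of that definition only. The function name, the platform labels, and the efficiency values are hypothetical and are not taken from the paper or from PLSSVM.

```python
from typing import Mapping, Optional


def performance_portability(efficiencies: Mapping[str, Optional[float]]) -> float:
    """Harmonic mean of per-platform efficiencies (Pennycook et al.).

    `efficiencies` maps a platform name to an efficiency in (0, 1], or to
    None if the application does not run on that platform. The metric is 0
    as soon as one platform in the set is unsupported.
    """
    if not efficiencies:
        raise ValueError("the platform set must not be empty")
    if any(e is None or e <= 0.0 for e in efficiencies.values()):
        return 0.0
    return len(efficiencies) / sum(1.0 / e for e in efficiencies.values())


# Hypothetical efficiencies for illustration only, NOT results from the paper.
example = {"nvidia_gpu": 0.5, "amd_gpu": 1.0, "intel_cpu": 0.5}
score = performance_portability(example)

# A backend that cannot run on one of the platforms scores zero.
unsupported = performance_portability({"nvidia_gpu": 0.9, "amd_gpu": None})
```

Because the harmonic mean is dominated by the smallest efficiency, a backend that is fast on two platforms but slow on the third scores lower than its arithmetic average would suggest.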