Towards Cross-Platform Portability of Coupled-Cluster Methods with Perturbative Triples using SYCL
Abhishek Bagusetty, Ajay Panyala, Gavin Brown, Jack Kirk
2022 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), November 2022
DOI: 10.1109/P3HPC56579.2022.00013
Abstract
Tensor contractions are the fundamental computational operation in computational chemistry, and they dictate the performance of widely used coupled-cluster (CC) methods. In this work, we study SYCL, a single-source, cross-platform C++ abstraction-layer programming model, for computational chemistry methods such as the CCSD(T) coupled-cluster formalism. An existing optimized CUDA implementation was migrated to SYCL to make use of a novel algorithm with tractable GPU memory requirements for solving the high-dimensional tensor contractions used to accelerate CCSD(T). We present the cross-platform performance achieved with the SYCL implementation of the non-iterative (perturbative) triples contribution of CCSD(T), which is considered the performance bottleneck, on NVIDIA A100 and AMD Instinct MI250X GPUs. Additionally, we compare against the same performance metrics obtained with the vendor-native programming models CUDA and ROCm HIP. Our results indicate that the performance of SYCL measured at scale is on par with the HIP code on AMD MI250X GPUs, while it slightly trails CUDA on NVIDIA A100 GPUs.
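To illustrate the single-source model the abstract refers to, the following is a minimal sketch, not the authors' code: a small dense contraction C[i][j] = sum_k A[i][k] * B[k][j] written as a SYCL 2020 kernel. The problem size N and the use of unified shared memory (USM) allocations are illustrative assumptions; the paper's CCSD(T) kernels operate on much larger, higher-dimensional tensors.

```cpp
// Minimal SYCL sketch of a dense contraction (illustrative only, not the
// paper's implementation). The same C++ lambda compiles for NVIDIA, AMD,
// or Intel backends depending on the SYCL toolchain used to build it.
#include <sycl/sycl.hpp>
#include <iostream>

int main() {
  constexpr size_t N = 64;                  // illustrative problem size
  sycl::queue q{sycl::default_selector_v};  // selects a device at runtime

  // Device-accessible USM buffers for the two input tensors and the output.
  float *A = sycl::malloc_shared<float>(N * N, q);
  float *B = sycl::malloc_shared<float>(N * N, q);
  float *C = sycl::malloc_shared<float>(N * N, q);
  for (size_t i = 0; i < N * N; ++i) { A[i] = 1.0f; B[i] = 2.0f; C[i] = 0.0f; }

  // One work-item per output element C[i][j]; each reduces over k.
  q.parallel_for(sycl::range<2>{N, N}, [=](sycl::id<2> idx) {
    const size_t i = idx[0], j = idx[1];
    float acc = 0.0f;
    for (size_t k = 0; k < N; ++k)
      acc += A[i * N + k] * B[k * N + j];
    C[i * N + j] = acc;
  }).wait();

  std::cout << "C[0] = " << C[0] << "\n";   // expect 2*N = 128
  sycl::free(A, q); sycl::free(B, q); sycl::free(C, q);
  return 0;
}
```

Because the kernel body is plain C++, a CUDA kernel of this shape can typically be migrated by replacing the launch configuration and thread indexing with a sycl::range and sycl::id, which is the kind of mechanical translation the migration described in the abstract relies on.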