Analyzing the performance portability of SYCL across CPUs, GPUs, and hybrid systems with SW sequence alignment

IF 6.2 · CAS Zone 2, Computer Science · JCR Q1, COMPUTER SCIENCE, THEORY & METHODS

Future Generation Computer Systems-The International Journal of Escience · Volume 170, Article 107838 · Pub Date: 2025-04-05 · DOI: 10.1016/j.future.2025.107838

Manuel Costanzo, Enzo Rucci, Carlos García-Sánchez, Marcelo Naiouf, Manuel Prieto-Matías


Citations: 0

Abstract


Analyzing the performance portability of SYCL across CPUs, GPUs, and hybrid systems with SW sequence alignment

The high-performance computing (HPC) landscape is undergoing rapid transformation, with an increasing emphasis on energy-efficient and heterogeneous computing environments. This comprehensive study extends our previous research on SYCL’s performance portability by evaluating its effectiveness across a broader spectrum of computing architectures, including CPUs, GPUs, and hybrid CPU–GPU configurations from NVIDIA, Intel, and AMD. Our analysis covers single-GPU, multi-GPU, single-CPU, and CPU–GPU hybrid setups, using two common, bioinformatic applications as a case study. The results demonstrate SYCL’s versatility across different architectures, maintaining comparable performance to CUDA on NVIDIA GPUs while achieving similar architectural efficiency rates on AMD and Intel GPUs in the majority of cases tested. SYCL also demonstrated remarkable versatility and effectiveness across CPUs from various manufacturers, including the latest hybrid architectures from Intel. Although SYCL showed excellent functional portability in hybrid CPU–GPU configurations, performance varied significantly based on specific hardware combinations. Some performance limitations were identified in multi-GPU and CPU–GPU configurations, primarily attributed to workload distribution strategies rather than SYCL-specific constraints. These findings position SYCL as a promising unified programming model for heterogeneous computing environments, particularly for bioinformatic applications.


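Source journal

Future Generation Computer Systems-The International Journal of Escience (Engineering & Technology - Computer Science: Theory & Methods)

CiteScore: 19.90

Self-citation rate: 2.70%

Articles published: 376

Review time: 10.6 months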

About the journal: Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications. Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration. Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.