Exploring the performance of CP2K simulations on the CPU-GPDSP Fusion intra-heterogeneous HPC system
Authors: Qi Du, Feng Wang, Hui Huang
Journal: Future Generation Computer Systems-The International Journal of Escience, volume 174, Article 107912
Publication date: 2025-05-29
Publication type: Journal Article
DOI: 10.1016/j.future.2025.107912
URL: https://www.sciencedirect.com/science/article/pii/S0167739X25002079
Impact factor: 6.2 (JCR Q1, Computer Science, Theory & Methods)
This study explores the performance of CP2K on a heterogeneous HPC system integrating CPU and GPDSP, aiming to optimize computational efficiency for large-scale molecular simulations. CP2K is an open-source software package designed for simulating condensed matter systems, particularly excelling in handling complex quantum chemistry and molecular dynamics workloads. We present the integration of CPU and GPDSP in a heterogeneous processor environment, detailing key optimizations, including vectorization of integral operations in Density Functional Theory (DFT) and GEMM optimization based on processor memory architecture. Furthermore, we propose a parallel computing strategy tailored to the hardware’s architectural characteristics to maximize performance. Benchmarking results using the CP2K test suite demonstrate significant computational and parallel efficiency gains. For instance, in a water molecule simulation, the system achieves 79% parallel efficiency when scaled to 256 compute nodes, utilizing approximately 400,000 cores. Finally, we conduct a comparative performance analysis between CPU-GPDSP and AVX-512 vector processors, highlighting the advantages and potential limitations of GPDSP acceleration in heterogeneous HPC environments.
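The 79% figure quoted above is a parallel efficiency. The abstract does not state the baseline node count or whether the scaling is strong or weak, so the following is only a minimal sketch of the usual strong-scaling definition (measured speedup divided by ideal speedup). The function name `parallel_efficiency` and all timings are hypothetical illustrations, not values from the paper.

```python
def parallel_efficiency(t_base, n_base, t_scaled, n_scaled):
    """Strong-scaling parallel efficiency relative to a baseline run.

    t_base, t_scaled: wall-clock times of the same problem on
    n_base and n_scaled nodes. Returns measured speedup divided by
    the ideal (linear) speedup, so 1.0 means perfect scaling.
    """
    if min(t_base, t_scaled) <= 0 or min(n_base, n_scaled) <= 0:
        raise ValueError("times and node counts must be positive")
    speedup = t_base / t_scaled   # how much faster the larger run is
    ideal = n_scaled / n_base     # speedup expected from linear scaling
    return speedup / ideal


if __name__ == "__main__":
    # Made-up timings for illustration only (not taken from the paper):
    # 1000 s on 8 nodes versus 160 s on 64 nodes.
    eff = parallel_efficiency(t_base=1000.0, n_base=8,
                              t_scaled=160.0, n_scaled=64)
    print(f"parallel efficiency: {eff:.1%}")
```

Under this definition, an efficiency below 100% means that communication, load imbalance, or serial sections consume part of the added resources. If the paper reports weak scaling instead, the ratio would be taken between run times at a fixed per-node problem size.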
About the journal:
Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications.
Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration.
Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.