{"title":"实现基于 PGAS 的可扩展高效分布式 OpenMP","authors":"Baodi Shan, Mauricio Araya-Polo, Barbara Chapman","doi":"arxiv-2409.02830","DOIUrl":null,"url":null,"abstract":"MPI+X has been the de facto standard for distributed memory parallel\nprogramming. It is widely used primarily as an explicit two-sided communication\nmodel, which often leads to complex and error-prone code. Alternatively, PGAS\nmodel utilizes efficient one-sided communication and more intuitive\ncommunication primitives. In this paper, we present a novel approach that\nintegrates PGAS concepts into the OpenMP programming model, leveraging the LLVM\ncompiler infrastructure and the GASNet-EX communication library. Our model\naddresses the complexity associated with traditional MPI+OpenMP programming\nmodels while ensuring excellent performance and scalability. We evaluate our\napproach using a set of micro-benchmarks and application kernels on two\ndistinct platforms: Ookami from Stony Brook University and NERSC Perlmutter.\nThe results demonstrate that DiOMP achieves superior bandwidth and lower\nlatency compared to MPI+OpenMP, up to 25% higher bandwidth and down to 45% on\nlatency. DiOMP offers a promising alternative to the traditional MPI+OpenMP\nhybrid programming model, towards providing a more productive and efficient way\nto develop high-performance parallel applications for distributed memory\nsystems.","PeriodicalId":501291,"journal":{"name":"arXiv - CS - Performance","volume":"40 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Towards a Scalable and Efficient PGAS-based Distributed OpenMP\",\"authors\":\"Baodi Shan, Mauricio Araya-Polo, Barbara Chapman\",\"doi\":\"arxiv-2409.02830\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"MPI+X has been the de facto standard for distributed memory parallel\\nprogramming. 
It is widely used primarily as an explicit two-sided communication\\nmodel, which often leads to complex and error-prone code. Alternatively, PGAS\\nmodel utilizes efficient one-sided communication and more intuitive\\ncommunication primitives. In this paper, we present a novel approach that\\nintegrates PGAS concepts into the OpenMP programming model, leveraging the LLVM\\ncompiler infrastructure and the GASNet-EX communication library. Our model\\naddresses the complexity associated with traditional MPI+OpenMP programming\\nmodels while ensuring excellent performance and scalability. We evaluate our\\napproach using a set of micro-benchmarks and application kernels on two\\ndistinct platforms: Ookami from Stony Brook University and NERSC Perlmutter.\\nThe results demonstrate that DiOMP achieves superior bandwidth and lower\\nlatency compared to MPI+OpenMP, up to 25% higher bandwidth and down to 45% on\\nlatency. DiOMP offers a promising alternative to the traditional MPI+OpenMP\\nhybrid programming model, towards providing a more productive and efficient way\\nto develop high-performance parallel applications for distributed memory\\nsystems.\",\"PeriodicalId\":501291,\"journal\":{\"name\":\"arXiv - CS - Performance\",\"volume\":\"40 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Performance\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2409.02830\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - 
Performance","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.02830","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Towards a Scalable and Efficient PGAS-based Distributed OpenMP
MPI+X has been the de facto standard for distributed-memory parallel
programming. It is most often used as an explicit two-sided communication
model, which frequently leads to complex and error-prone code. In contrast, the
PGAS model relies on efficient one-sided communication and more intuitive
communication primitives. In this paper, we present a novel approach that
integrates PGAS concepts into the OpenMP programming model, leveraging the LLVM
compiler infrastructure and the GASNet-EX communication library. Our model
addresses the complexity associated with traditional MPI+OpenMP programming
models while ensuring excellent performance and scalability. We evaluate our
approach using a set of micro-benchmarks and application kernels on two
distinct platforms: Ookami from Stony Brook University and NERSC Perlmutter.
The results demonstrate that DiOMP achieves higher bandwidth and lower
latency than MPI+OpenMP: up to 25% higher bandwidth and up to 45% lower
latency. DiOMP thus offers a promising alternative to the traditional
MPI+OpenMP hybrid programming model, providing a more productive and
efficient way to develop high-performance parallel applications for
distributed-memory systems.