International Conference on Partitioned Global Address Space Programming Models: Latest Publications

Efficient Interoperability of OpenSHMEM on Multicore Architectures
International Conference on Partitioned Global Address Space Programming Models Pub Date : 2014-10-06 DOI: 10.1145/2676870.2676889
K. Ibrahim
Abstract: Most HPC programming models face an interoperability challenge because of the advent of multi/many-core architectures [1, 2, 3]. Efficient interoperability, for instance with shared-memory programming models such as OpenMP, requires reconsidering the design of various levels of the programming model software stack. While support for interoperability typically exists at the hardware and system messaging library levels, most programming models lack the interfaces that ease such interoperability. In this paper, we discuss the requirements of efficient interoperability and show the alternative paths for satisfying them for OpenSHMEM. We discuss the implications of maintaining the current interfaces and of enhancements that ease interoperability.
Citations: 0
Multi-Threaded OpenSHMEM: A Bad Idea?
International Conference on Partitioned Global Address Space Programming Models Pub Date : 2014-10-06 DOI: 10.1145/2676870.2676890
Gabriele Jost, U. Hanebutte, James Dinan
Abstract: The purpose of this document is to stimulate discussions on support for multi-threaded execution in OpenSHMEM. Why is there a need for any thread support at all for an API that follows a shared global address space paradigm? In our ongoing work, we investigate opportunities and challenges introduced through multi-threading, namely implementation challenges and opportunities, and required (as well as desirable) extensions to the API.
Citations: 1
Towards a matrix-oriented strided interface in OpenSHMEM
International Conference on Partitioned Global Address Space Programming Models Pub Date : 2014-10-06 DOI: 10.1145/2676870.2676888
J. Hammond
Abstract: New communication routines are proposed for OpenSHMEM to allow the efficient implementation of distributed matrix computations.
Citations: 5
Experiences at scale with PGAS versions of a Hydrodynamics application
International Conference on Partitioned Global Address Space Programming Models Pub Date : 2014-10-06 DOI: 10.1145/2676870.2676873
A. Mallinson, S. Jarvis, W. Gaudin, J. Herdman
Abstract: In this work we directly evaluate two PGAS programming models, CAF and OpenSHMEM, as candidate technologies for improving the performance and scalability of scientific applications on future exascale HPC platforms. PGAS approaches are considered by many to represent a promising research direction with the potential to solve some of the existing problems preventing codebases from scaling to exascale levels of performance. The aim of this work is to better inform the exascale planning at large HPC centres such as AWE. Such organisations invest significant resources maintaining and updating existing scientific codebases, many of which were not designed to run at the scales required to reach exascale levels of computational performance on future system architectures. We document our approach for implementing a recently developed Lagrangian-Eulerian explicit hydrodynamics mini-application in each of these PGAS languages. Furthermore, we present our results and experiences from scaling these different approaches to high node counts on two state-of-the-art, large-scale system architectures from Cray (XC30) and SGI (ICE-X), and compare their utility against an equivalent existing MPI implementation.
Citations: 12
Native Mode-Based Optimizations of Remote Memory Accesses in OpenSHMEM for Intel Xeon Phi
International Conference on Partitioned Global Address Space Programming Models Pub Date : 2014-10-06 DOI: 10.1145/2676870.2676881
N. Namashivayam, Sayan Ghosh, Dounia Khaldi, Deepak Eachempati, B. Chapman
Abstract: OpenSHMEM is a PGAS library that aims to deliver high performance while retaining portability. Communication operations are a major obstacle to scalable parallel performance and are highly dependent on the target architecture. However, to date there has been no work on how to efficiently support OpenSHMEM running natively on Intel Xeon Phi, a highly parallel, power-efficient and widely used many-core architecture. Given the importance of communication in parallel architectures, this paper describes a novel methodology for optimizing remote-memory accesses for execution of OpenSHMEM programs on Intel Xeon Phi processors. In native mode, we can exploit the Xeon Phi shared memory and convert OpenSHMEM one-sided communication calls into local load/store statements using the shmem_ptr routine. This approach makes it possible for the compiler to perform essential optimizations for Xeon Phi such as vectorization. To the best of our knowledge, this is the first time the impact of shmem_ptr is analyzed thoroughly on a many-core system. We show the benefits of this approach on the PGAS-Microbenchmarks we specifically developed for this research. Our results show a decrease in latency for one-sided communication operations of up to 60% and an increase in bandwidth of up to 12x. Moreover, we study different reduction algorithms and exploit local load/store to optimize data transfers in these algorithms for Xeon Phi, which yields improvements of up to 22% compared to MVAPICH and up to 60% compared to Intel MPI. Apart from microbenchmarks, experimental results on NAS IS and SP benchmarks show that performance gains of up to 20x are possible.
Citations: 8
DART-MPI: An MPI-based Implementation of a PGAS Runtime System
International Conference on Partitioned Global Address Space Programming Models Pub Date : 2014-10-06 DOI: 10.1145/2676870.2676875
Huan Zhou, Yousri Mhedheb, K. Idrees, C. W. Glass, J. Gracia, K. Fürlinger, J. Tao
Abstract: A Partitioned Global Address Space (PGAS) approach treats a distributed system as if the memory were shared on a global level. Given such a global view of memory, the user may program applications much as on shared-memory systems. This greatly simplifies the task of developing parallel applications, because no explicit communication has to be specified in the program for data exchange between different computing nodes. In this paper we present DART, a runtime environment that implements the PGAS paradigm on large-scale high-performance computing clusters. A specific feature of our implementation is the use of one-sided communication of the Message Passing Interface (MPI) version 3 (i.e. MPI-3) as the underlying communication substrate. We evaluated the performance of the implementation with several low-level kernels in order to determine overheads and limitations in comparison to the underlying MPI-3.
Citations: 28
OpenSHMEM Reference Implementation using UCCS-uGNI Transport Layer
International Conference on Partitioned Global Address Space Programming Models Pub Date : 2014-10-06 DOI: 10.1145/2676870.2676892
T. Janjusic, Pavel Shamis, Manjunath Gorentla Venkata, S. Poole
Abstract: OpenSHMEM is a library interface implementation and specification that enables the implementation of the Partitioned Global Address Space (PGAS) model. It exports modern RDMA network functionality and communication semantics to applications very efficiently. There are many closed-source implementations of OpenSHMEM for modern RDMA interconnects such as InfiniBand and Cray's Gemini and Aries. Given the important role that Cray systems play in HPC, in this paper we present an open-source implementation of OpenSHMEM for Cray XE/XK/XC systems. To implement OpenSHMEM, we use the uGNI interface. uGNI is a generic interface that is designed for multiple programming models. The interface fits the goal of UCCS well. Having OpenSHMEM with UCCS-uGNI allows usage of the same implementation over multiple interconnects. This also translates into many advantages that come with common code, such as resource sharing and increased productivity because of less code maintenance. Preliminary results show that OpenSHMEM-UCCS performs comparably to state-of-the-art Cray SHMEM for Put, Get, and AMO operations.
Citations: 2
Extending the OpenSHMEM Memory Model to Support User-Defined Spaces
International Conference on Partitioned Global Address Space Programming Models Pub Date : 2014-10-06 DOI: 10.1145/2676870.2676884
A. Welch, S. Pophale, Pavel Shamis, Oscar R. Hernandez, S. Poole, B. Chapman
Abstract: OpenSHMEM is an open standard for SHMEM libraries. With the standardisation process complete, the community is looking towards extending the API for increasing programmer flexibility and extreme scalability. According to the current OpenSHMEM specification (revision 1.1), allocation of symmetric memory is collective across all PEs executing the application. For better work sharing and memory utilisation, we are proposing the concepts of teams and spaces for OpenSHMEM that together allow allocation of memory only across user-specified teams. Through our implementation we show that by using teams we can confine memory allocation and usage to only the PEs that actually communicate via symmetric memory. We provide preliminary results demonstrating that creating spaces for teams consumes fewer memory resources than the current alternative. We also examine the impact of our extensions on Scalable Synthetic Compact Applications #3 (SSCA3), which is a sensor processing and knowledge formation kernel involving file I/O, and show that up to 30% of symmetric memory allocation can be eliminated without affecting the correctness of the benchmark.
Citations: 15
Contexts: A Mechanism for High Throughput Communication in OpenSHMEM
International Conference on Partitioned Global Address Space Programming Models Pub Date : 2014-10-06 DOI: 10.1145/2676870.2676872
James Dinan, Mario Flajslik
Abstract: This paper introduces a proposed extension to the OpenSHMEM parallel programming model, called communication contexts. Contexts introduce a new construct that allows a programmer to generate independent streams of communication operations. In hybrid executions where multiple threads execute within an OpenSHMEM process, contexts eliminate interference between threads and enable the OpenSHMEM library to map operations generated by threads to private communication resource sets. By providing thread isolation, contexts eliminate synchronization overheads and enable each thread to drive a similar set of resources and achieve performance comparable to an OpenSHMEM process. In conventional, single-threaded execution, contexts provide greater control over ordering of operations and can improve communication and computation overlap. A detailed description of the contexts interface and its implementation for the Portals 4 network programming interface is presented. The implementation is evaluated using Mandelbrot set and integer sorting (IS) benchmarks. Contexts provide a 25% performance improvement for Mandelbrot by eliminating thread interference and enabling pipelining, and a 35% improvement was achieved for IS by enabling more effective communication/computation overlap.
Citations: 21
A Heterogeneous GASNet Implementation for FPGA-accelerated Computing
International Conference on Partitioned Global Address Space Programming Models Pub Date : 2014-10-06 DOI: 10.1145/2676870.2676885
Ruediger Willenberg, P. Chow
Abstract: This paper introduces an effort to incorporate reconfigurable logic (FPGA) components into the Partitioned Global Address Space model. For this purpose, we have developed a heterogeneous implementation of GASNet that supports distributed applications with software and hardware components and easy migration of kernels from software to hardware. We present a use case and preliminary performance numbers.
Citations: 8