Title: HabaneroUPC++: a Compiler-free PGAS Library
Authors: Vivek Kumar, Yili Zheng, Vincent Cavé, Zoran Budimlic, Vivek Sarkar
DOI: 10.1145/2676870.2676879 (https://doi.org/10.1145/2676870.2676879)
Venue: International Conference on Partitioned Global Address Space Programming Models, 2014-10-06
Abstract: Partitioned Global Address Space (PGAS) programming models combine shared- and distributed-memory features, providing the basis for high-performance, high-productivity parallel programming environments. UPC++ [39] is a very recent PGAS implementation that takes a library-based approach and avoids the complexities associated with compiler transformations. However, this implementation does not support dynamic task parallelism and relies on other threading models (e.g., OpenMP or pthreads) to exploit parallelism within a PGAS place.
In this paper, we introduce a compiler-free PGAS library called HabaneroUPC++, which supports a tighter integration of intra-place and inter-place parallelism than standard hybrid programming approaches. The library makes heavy use of C++11 lambda functions in its APIs; lambdas avoid the need for compiler support while still retaining the syntactic convenience of language-based approaches. The HabaneroUPC++ implementation is based on a tight integration of the UPC++ library and the Habanero-C++ library, with new extensions to support the integration: UPC++ provides PGAS communication and function shipping using GASNet, while Habanero-C++ provides intra-place work stealing integrated with function shipping. We demonstrate the programmability and performance of our implementation using two benchmarks, scaled up to 6K cores. The insights developed in this paper promise to further enhance the usability and popularity of PGAS programming models.
Title: Development and Extension of Atomic Memory Operations in OpenSHMEM
Authors: Pavel Shamis, Manjunath Gorentla Venkata, S. Poole, S. Pophale, Mike Dubman, R. Graham, Dror Goldenberg, G. Shainer
DOI: 10.1145/2676870.2676891 (https://doi.org/10.1145/2676870.2676891)
Venue: International Conference on Partitioned Global Address Space Programming Models, 2014-10-06
Abstract: A distinguishing characteristic of OpenSHMEM compared to other PGAS programming model implementations is its support for atomic memory operations (AMOs), with a rich set of interfaces for 32-bit and 64-bit datatypes. On most modern networks, network-implemented AMOs are known to outperform software-implemented AMOs, so a high-performance OpenSHMEM implementation should offload AMOs to the underlying network hardware when possible. Challenges arise, however, when (a) the underlying hardware does not support the full set of atomic operations, (b) more than one device is used, or (c) heterogeneous systems with multiple types of devices are involved. In this paper, we analyze these challenges and discuss potential solutions to address them.
Title: Evaluation of PGAS Communication Paradigms with Geometric Multigrid
Authors: H. Shan, A. Kamil, Samuel Williams, Yili Zheng, K. Yelick
DOI: 10.1145/2676870.2676874 (https://doi.org/10.1145/2676870.2676874)
Venue: International Conference on Partitioned Global Address Space Programming Models, 2014-10-06
Abstract: Partitioned Global Address Space (PGAS) languages and one-sided communication enable application developers to select the communication paradigm that balances the performance needs of applications with the productivity desires of programmers. In this paper, we evaluate three different one-sided communication paradigms in the context of geometric multigrid using the miniGMG benchmark. Although miniGMG's static, regular, and predictable communication does not exploit the ultimate potential of PGAS models, multigrid solvers appear in many contemporary applications and represent one of the most important communication patterns. We use UPC++, a PGAS extension of C++, as the vehicle for our evaluation, though our work is applicable to any of the existing PGAS languages and models. We compare performance with a highly tuned MPI baseline, and the results indicate that the most promising approach toward achieving both performance and ease of programming is to use high-level abstractions, such as the multidimensional arrays provided by UPC++, that hide data aggregation and messaging in the runtime library.
Title: One-Sided Append: A New Communication Paradigm For PGAS Models
Authors: James Dinan, Mario Flajslik
DOI: 10.1145/2676870.2676886 (https://doi.org/10.1145/2676870.2676886)
Venue: International Conference on Partitioned Global Address Space Programming Models, 2014-10-06
Abstract: One-sided append represents a new class of one-sided operations that can be used to aggregate messages from multiple communication sources into a single destination buffer. This new communication paradigm is analyzed in terms of its impact on the OpenSHMEM parallel programming model and applications. Implementation considerations are discussed and an accelerated implementation using the Portals 4 networking API is presented. Initial experimental results with the NAS integer sort benchmark indicate that this new operation can significantly improve the communication performance of such applications.
Title: Hiding latency in Coarray Fortran 2.0
Authors: William N. Scherer, L. Adhianto, G. Jin, J. Mellor-Crummey, Chaoran Yang
DOI: 10.1145/2020373.2020387 (https://doi.org/10.1145/2020373.2020387)
Venue: International Conference on Partitioned Global Address Space Programming Models, 2010-10-12
Abstract: In Numrich and Reid's 1998 proposal [17], Coarray Fortran is a simple set of extensions to Fortran 95, principal among which is support for shared data known as coarrays. Responding to shortcomings in the Fortran Standards Committee's addition of coarrays to the Fortran 2008 standard, we at Rice envisioned an extensive update which has come to be known as Coarray Fortran 2.0 [15]. In this paper, we chronicle the evolution of Coarray Fortran 2.0 as it gains support for asynchronous point-to-point and collective operations. We outline how these operations are implemented and describe code fragments from several benchmark programs to show how we use these operations to hide latency by overlapping communication and computation.
Title: An open-source compiler and runtime implementation for Coarray Fortran
Authors: Deepak Eachempati, H. Jun, B. Chapman
DOI: 10.1145/2020373.2020386 (https://doi.org/10.1145/2020373.2020386)
Venue: International Conference on Partitioned Global Address Space Programming Models, 2010-10-12
Abstract: Coarray Fortran (CAF) comprises a set of proposed language extensions to Fortran that are expected to be adopted as part of the Fortran 2008 standard. In contrast to prior open-source implementation efforts, our approach is to use a single, unified compiler infrastructure to translate, optimize, and generate binaries from CAF codes. In this paper, we describe our compiler and runtime implementation of CAF using an Open64-based compiler infrastructure. We detail how we generate a high-level intermediate representation from the CAF code in our compiler's front end, how our compiler analyzes and translates this IR to generate a binary that makes use of our runtime system, and how we support the runtime execution model with our runtime library. We have carried out experiments using both ARMCI- and GASNet-based runtime implementations, and we present these results.
Title: Performance modeling for multilevel communication in SHMEM+
Authors: V. Aggarwal, C. Yoon, A. George, H. Lam, G. Stitt
DOI: 10.1145/2020373.2020380 (https://doi.org/10.1145/2020373.2020380)
Venue: International Conference on Partitioned Global Address Space Programming Models, 2010-10-12
Abstract: The field of high-performance computing (HPC) is currently undergoing a major transformation brought upon by a variety of new processor device technologies. Accelerator devices (e.g., FPGA, GPU) are becoming increasingly popular as coprocessors in HPC, embedded, and other systems, improving application performance while in some cases also reducing energy consumption. The presence of such devices introduces additional levels of communication and memory hierarchy in the system, which warrants an expansion of conventional parallel-programming practices to address these differences. Programming models and libraries for heterogeneous, parallel, and reconfigurable computing such as SHMEM+ have been developed to support communication and coordination involving a diverse mix of processor devices. However, to evaluate the impact of communication on application performance and obtain optimal performance, a concrete understanding of the underlying communication infrastructure is often imperative. In this paper, we introduce a new multilevel communication model for representing various data transfers encountered in these systems and for predicting performance. Three use cases are presented and evaluated. First, the model enables application developers to perform early design-space exploration of communication patterns in their applications before undertaking the laborious and expensive process of implementation, yielding improved performance and productivity. Second, the model enables system developers to quickly optimize performance of data-transfer routines within tools such as SHMEM+ when being ported to a new platform. Third, the model augments tools such as SHMEM+ to automatically improve performance of data transfers by self-tuning internal parameters to match platform capabilities. Results from experiments with these use cases suggest marked improvement in performance, productivity, and portability.
Title: Improving UPC productivity via integrated development tools
Authors: Max Billingsley, Beth Tibbitts, A. George
DOI: 10.1145/2020373.2020381 (https://doi.org/10.1145/2020373.2020381)
Venue: International Conference on Partitioned Global Address Space Programming Models, 2010-10-12
Abstract: In the world of high-performance computing (HPC), there has been an increased focus in recent years upon the importance of productivity in HPC application development. One crucial aspect of productivity is the programming model used, and the family of partitioned global-address-space (PGAS) models, such as UPC and X10, has served to advance the state of the art in balancing performance and productivity. Also of great importance is the variety of development tools used to support activities such as editing, debugging, and optimizing programs. These tools are often most useful as part of an integrated development environment (IDE). While some progress has been made towards bringing IDE capabilities into the HPC world, in particular by way of Eclipse projects, support has mainly focused on MPI and OpenMP tools.
In this paper, we present research and development activities that are bringing Eclipse-based IDE capabilities to the PGAS developer community. We focus on tools for UPC, giving background on previously existing capabilities to work with UPC programs in Eclipse and then presenting a tool-chain and project wizard for the open-source Berkeley UPC compiler, basic UPC static analysis tools, and integration of our performance analysis tool (Parallel Performance Wizard) supporting UPC. Finally, we conclude by proposing future work and providing recommendations for further integration of UPC and other PGAS tools to enhance overall developer productivity.
Title: Hybrid PGAS runtime support for multicore nodes
Authors: F. Blagojevic, Paul H. Hargrove, Costin Iancu, K. Yelick
DOI: 10.1145/2020373.2020376 (https://doi.org/10.1145/2020373.2020376)
Venue: International Conference on Partitioned Global Address Space Programming Models, 2010-10-12
Abstract: With multicore processors as the standard building block for high performance systems, parallel runtime systems need to provide excellent performance on shared memory, distributed memory, and hybrids. Conventional wisdom suggests that threads should be used as the runtime mechanism within shared memory, and two runtime versions for shared and distributed memory are often designed and implemented separately, retrofitting after the fact for hybrid systems. In this paper we consider the problem of implementing a runtime layer for Partitioned Global Address Space (PGAS) languages, which offer a uniform programming abstraction for hybrid machines. We present a new process-based shared memory runtime and compare it to our previous pthreads implementation. Both are integrated with the GASNet communication layer, and they can co-exist with one another. We evaluate the shared memory runtime approaches, showing that they interact in important and sometimes surprising ways with the communication layer. Using a set of microbenchmarks and application-level benchmarks on an IBM BG/P, Cray XT, and InfiniBand cluster, we show that threads, processes, and combinations of both are needed for maximum performance. Our new runtime shows speedups of over 60% for application benchmarks and 100% for collective communication benchmarks, when compared to the previous implementation. Our work primarily targets PGAS languages, but some of the lessons are relevant to other parallel runtime systems and libraries.
Title: Asynchronous PGAS runtime for Myrinet networks
Authors: Montse Farreras, G. Almási
DOI: 10.1145/2020373.2020377 (https://doi.org/10.1145/2020373.2020377)
Venue: International Conference on Partitioned Global Address Space Programming Models, 2010-10-12
Abstract: PGAS languages aim to enhance productivity for large-scale systems. The IBM Asynchronous PGAS runtime (APGAS) supports various high-productivity programming languages including UPC, X10, and CAF. The runtime has been designed for scalability and performance portability, and it includes optimized implementations for the LAPI and Blue Gene DCMF communication subsystems.
This paper presents an optimized implementation of the IBM APGAS runtime for Myrinet networks, on top of the MX communication library. It explains the challenges of implementing a one-sided communication model (APGAS) on top of a two-sided communication API such as MX.
We show that our implementation outperforms the Berkeley GASNet runtime in terms of latency and bandwidth. We also demonstrate scalability of various HPC benchmarks up to 1024 processes.