Proceedings of the 7th International Workshop on Programming Models and Applications for Multicores and Manycores: Latest Articles

Parallel Locality and Parallelization Quality
Bernard Goossens, David Parello, Katarzyna Porada, Djallal Rahmoune
Abstract: This paper presents a new distributed computation model adapted to manycore processors. In this model, the run is spread on the available cores by fork machine instructions produced by the compiler, for example at function calls and loops iterations. This approach is to be opposed to the actual model of computation based on cache and predictor. Cache efficiency relies on data locality and predictor efficiency relies on the reproducibility of the control. Data locality and control reproducibility are less effective when the execution is distributed. The computation model proposed is based on a new core hardware. Its main features are described in this paper. This new core is the building block of a manycore design. The processor automatically parallelizes an execution. It keeps the computation deterministic by constructing a totally ordered trace of the machine instructions run. References are renamed, including memory, which fixes the communications and synchronizations needs. When a data is referenced, its producer is found in the trace and the reader is synchronized with the writer. This paper shows how a consumer can be located in the same core as its producer, improving parallel locality and parallelization quality. Our deterministic and fine grain distribution of a run on a manycore processor is compared with OS primitives and API based parallelization (e.g. pthread, OpenMP or MPI) and to compiler automatic parallelization of loops. The former implies (i) a high OS overhead meaning that only coarse grain parallelization is cost-effective and (ii) a non deterministic behaviour meaning that appropriate synchronization to eliminate wrong results is a challenge. The latter is unable to fully parallelize general purpose programs due to structures like functions, complex loops and branches.
DOI: https://doi.org/10.1145/2883404.2883410
Published: 2016-03-12
Citations: 2
Multi-GPU implementation of the Horizontal Diffusion method of the Weather Research and Forecast Model
L. Solano-Quinde, Ronald Gualan-Saavedra, Miguel Zúñiga-Prieto
Abstract: The Weather Research and Forecasting (WRF), a next generation mesoscale numerical weather prediction system, has a considerable amount of work regarding GPU acceleration. However, the amount of works exploiting multi-GPU systems is limited. This work constitutes an effort on using GPU computing over the WRF model and is focused on a computationally intensive portion of the WRF: the Horizontal Diffusion method. Particularly, this work presents the enhancements that enable a single-GPU based implementation to exploit the parallelism of multi-GPU systems. The performance of the multi-GPU and single-GPU based implementations are compared on a computational domain of 433x308 horizontal grid points with 35 vertical levels, and the resulting speedup of the kernel is 3.5x relative to one GPU. The experiments were carried out on a multi-core computer with two NVIDIA Tesla K40m GPUs.
DOI: https://doi.org/10.1145/2883404.2883407
Published: 2016-03-12
Citations: 4
Proceedings of the 7th International Workshop on Programming Models and Applications for Multicores and Manycores
DOI: https://doi.org/10.1145/2883404
Citations: 1