2018 26th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP): Latest Publications

TMbarrier: Speculative Barriers Using Hardware Transactional Memory

Pub Date: 2018-11-15 DOI: 10.1109/PDP2018.2018.00036

Manuel Pedrero, E. Gutiérrez, O. Plata

Citations: 1

A Generic Learning Multi-agent-System Approach for Spatio-Temporal-, Thermal- and Energy-Aware Scheduling

Pub Date: 2018-06-07 DOI: 10.1109/PDP2018.2018.00010

Christina Herzog, J. Pierson

Citations: 1

Evaluating the Effect of Multi-Tenancy Patterns in Containerized Cloud-Hosted Content Management System

Pub Date: 2018-06-07 DOI: 10.1109/PDP2018.2018.00047

A. Adewojo, J. Bass

Citations: 4

Developing and Using a Geometric Multigrid, Unstructured Grid Mini-Application to Assess Many-Core Architectures

Pub Date: 2018-06-06 DOI: 10.1109/PDP2018.2018.00018

A. Owenson, Steven A. Wright, Richard A. Bunt, S. Jarvis, Y. Ho, Matthew J. Street

Abstract: Achieving high performance in large scientific codes is a difficult task. This has led to the development of numerous mini-applications that are more tractable to analyse, while retaining the performance characteristics of their full-sized counterparts. These "mini-apps" also enable faster hardware evaluation, and for sensitive codes allow evaluation of systems outside of access approval processes. In this paper we develop a mini-application of a geometric multigrid, unstructured grid Computational Fluid Dynamics (CFD) code, designed to exhibit similar performance characteristics without sharing code. We detail our experiences developing this application, using guidelines detailed in existing research, and contribute further additions to these to aid future mini-application developers. Our application is validated against the inviscid flux routine of HYDRA, a CFD code developed by Rolls-Royce, which confirms that the parent kernel and mini-application share fundamental causes of parallel inefficiency. We then use the mini-application to assess the impact of Intel's Knights Landing (KNL) on performance. We find that the mini-app and parent kernel continue to share scaling characteristics; however, a comparison with Broadwell performance exposed significant differences between the kernels that were undetected by the validation.

Citations: 2

Improving Availability in Distributed Tuple Spaces Via Sharing Abstractions and Replication Strategies

Pub Date: 2018-03-21 DOI: 10.1109/PDP2018.2018.00052

V. Buravlev, R. Nicola, Alberto Lluch-Lafuente, C. A. Mezzina

Citations: 0

A Parallel Implementation of WAND on GPUs

Pub Date: 2018-03-21 DOI: 10.1109/PDP2018.2018.00011

Roussian R. A. Gaioso, V. Gil-Costa, H. Guardia, H. Senger

Abstract: In this paper we propose and evaluate new strategies for parallel top-k query processing on GPUs. Our strategies are based on the document-at-a-time approach and have been implemented and tested with the WAND ranking algorithm. In our first strategy (named homogeneous), the posting lists are evenly partitioned among thread blocks. Our second algorithm, named heterogeneous, partitions the posting lists according to document identifier intervals, so partitions may have different sizes. We also propose three threshold sharing policies, named Local, Safe-R and Safe-WR, which emulate the WAND algorithm's global pruning technique. We evaluated our proposals using AND/OR queries, and the results show that the homogeneous algorithm allows better speedups through higher occupancy of the SMs, but at the cost of a lower recall. The heterogeneous algorithm produces the exact top-k documents and shows promising speedups. Also, the Safe-R and Safe-WR policies for threshold propagation allowed better performance, provided there is enough work per thread block, which proved true for queries composed of at least a few million documents.

Citations: 3

Efficient NAS Benchmark Kernels with C++ Parallel Programming

Pub Date: 2018-03-21 DOI: 10.1109/PDP2018.2018.00120

Dalvan Griebler, Junior Loff, G. Mencagli, M. Danelutto, L. G. Fernandes

Citations: 30

Divisible Load Scheduling of Image Processing Applications on the Heterogeneous Star Network Using a new Genetic Algorithm

Pub Date: 2018-03-21 DOI: 10.1109/PDP2018.2018.00019

S. Aali, H. Shahhoseini, N. Bagherzadeh

Citations: 16

Performance Evaluation of the Metadata-Driven MASi Research Data Management Repository Service

Pub Date: 2018-03-21 DOI: 10.1109/PDP2018.2018.00059

Richard Grunzke, Volker Hartmann, T. Jejkal, H. Kollai, C. Dressler, Julia Dolhoff, Julia Stanek, H. Herold, A. Hoffmann, R. Müller-Pfefferkorn, Torsten Schrade, S. Herres‐Pawlis, G. Meinel, W. Nagel

Citations: 0

Characterizing Memory-Latency Sensitivity of Sparse Matrix Kernels

Pub Date: 2018-03-21 DOI: 10.1109/PDP2018.2018.00042

N. Tanabe, Toshio Endo

Abstract: Intel announced that it would launch a Xeon with high-latency main memory based on 3D XPoint in 2018. This paper presents a performance evaluation of sparse matrix kernels on future supercomputers with high-latency main memory such as 3D XPoint. The authors propose a high-throughput evaluation methodology for exhaustive experiments, which uses the University of Florida sparse matrix collection and/or LIS (a Library of Iterative Solvers for linear systems), etc. The proposed methodology is simple to use, flexible with respect to the environment, and high-throughput. The latency sensitivity of SpMV was measured with the proposed methodology for 208 sparse matrices and ten storage formats in only two days, which would take about ten years with conventional simulators. We obtained several interesting insights about latency-sensitive kernels, sparse matrices, storage formats, and preconditioners. We observed notable latency sensitivity in some applications, namely Graph500, HPCG, and some of the preconditioners of iterative solvers. We found that the latency sensitivity of SpMV is high for matrices larger than the capacity of the last-level cache. This suggests that main memory using 3D XPoint must be combined with a large DRAM cache.

Citations: 0