2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS)最新文献_第9页

Publisher's Information 出版商的信息

2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS) Pub Date : 2019-05-01 DOI: 10.1109/ipdps.2019.00118

引用次数: 0

The Path to Delivering Programable Exascale Systems 提供可编程百亿亿级系统的途径

2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS) Pub Date : 2019-05-01 DOI: 10.1109/IPDPS.2019.00081

L. DeRose

{"title":"The Path to Delivering Programable Exascale Systems","authors":"L. DeRose","doi":"10.1109/IPDPS.2019.00081","DOIUrl":"https://doi.org/10.1109/IPDPS.2019.00081","url":null,"abstract":"The trends in hardware architecture are paving the road towards Exascale. However, these trends are also increasing the complexity of design and development of the software developer environment that is deployed on modern supercomputers. Moreover, the scale and complexity of high-end systems creates a new set of challenges for application developers. Computational scientists are facing system characteristics that will significantly impact the programmability and scalability of applications. In order to address these issues, software architects need to take a holistic view of the entire system and deliver a high-level programming environment that can help maximize programmability, while not losing sight of performance portability. In this talk, I will discuss the current trends in computer architecture and their implications in application development and will present Cray’s high level parallel programming environment for performance and programmability on current and future supercomputers. I will also discuss some of the challenges and open research problems that need to be addressed in order to build a software developer environment for extreme-scale systems that helps users solve multi-disciplinary and multi-scale problems with high levels of performance, programmability, and scalability.","PeriodicalId":403406,"journal":{"name":"2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122391308","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 1

A Bin-Based Bitstream Partitioning Approach for Parallel CABAC Decoding in Next Generation Video Coding 下一代视频编码中并行CABAC解码的一种基于bin的比特流分割方法

2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS) Pub Date : 2019-05-01 DOI: 10.1109/IPDPS.2019.00112

Philipp Habermann, C. C. Chi, M. Alvarez-Mesa, B. Juurlink

引用次数: 2

Combining Prefetch Control and Cache Partitioning to Improve Multicore Performance 结合预取控制和Cache分区，提高多核性能

2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS) Pub Date : 2019-05-01 DOI: 10.1109/IPDPS.2019.00103

Gongjin Sun, Junjie Shen, A. Veidenbaum

引用次数: 10

Slate: Enabling Workload-Aware Efficient Multiprocessing for Modern GPGPUs Slate:为现代gpgpu启用工作负载感知高效多处理

2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS) Pub Date : 2019-05-01 DOI: 10.1109/IPDPS.2019.00035

Tyler N. Allen, Xizhou Feng, Rong Ge

{"title":"Slate: Enabling Workload-Aware Efficient Multiprocessing for Modern GPGPUs","authors":"Tyler N. Allen, Xizhou Feng, Rong Ge","doi":"10.1109/IPDPS.2019.00035","DOIUrl":"https://doi.org/10.1109/IPDPS.2019.00035","url":null,"abstract":"As GPUs now contribute the majority of computing power for HPC and data centers, improving GPU utilization becomes an important research problem. Sharing GPU among multiple kernels is an effective approach but requires judicious kernel selection and scheduling for optimal gains. In this paper, we present Slate, a software-based workload-aware GPU multiprocessing framework that enables concurrent kernels from different processes to share GPU devices. Slate selects concurrent kernels that have complementary resource demands at run time to minimize interference for individual kernels and improve GPU resource utilization. Slate adjusts the size of application kernels on-the-fly so that kernels readily share, release, and claim resources based on GPU status. It further controls overhead including data transfers and synchronization. We have built a prototype of Slate and evaluated it on a system with a NVIDIA Titan Xp card. Our experiments show that Slate improves system throughput by 11% on average and up to 35% at the best scenario for the tested applications, in comparison to NVIDIA MultiProcess Service (MPS) that uses hardware scheduling and the leftover policy for resource sharing.","PeriodicalId":403406,"journal":{"name":"2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS)","volume":"2020 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114822560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 15

An Architecture and Stochastic Method for Database Container Placement in the Edge-Fog-Cloud Continuum 边缘-雾-云连续体中数据库容器放置的体系结构和随机方法

2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS) Pub Date : 2019-05-01 DOI: 10.1109/IPDPS.2019.00050

Petar Kochovski, R. Sakellariou, M. Bajec, P. Drobintsev, V. Stankovski

{"title":"An Architecture and Stochastic Method for Database Container Placement in the Edge-Fog-Cloud Continuum","authors":"Petar Kochovski, R. Sakellariou, M. Bajec, P. Drobintsev, V. Stankovski","doi":"10.1109/IPDPS.2019.00050","DOIUrl":"https://doi.org/10.1109/IPDPS.2019.00050","url":null,"abstract":"Databases as software components may be used to serve a variety of smart applications. Currently, the Internet of Things (IoT), Artificial Intelligence (AI) and Cloud technologies are used in the course of projects such as the Horizon 2020 EU-Korea DECENTER project in order to implement four smart applications in the domains of Smart Homes, Smart Cities, Smart Construction and Robot Logistics. In these smart applications the Big Data pipeline starts from various sensor and video streams to which AI and feature extraction methods are applied. The resulting information is stored in database containers, which have to be placed on Edge, Fog or Cloud infrastructures. The placement decision depends on complex application requirements, including Quality of Service (QoS) requirements. Information that must be considered when making placement decisions includes the expected workload, the list of candidate infrastructures, geolocation, connectivity and similar. Software engineers currently perform such decisions manually, which usually leads to QoS threshold violations. This paper aims to automate the process of making such decisions. Therefore, the goals of this paper are to: (1) develop a decision making method for database container placement; (2) formally verify each placement decision and provide probability assurances to the software engineer for high QoS; and (3) design and implement a new architecture that automates the whole process. A new optimisation method is introduced, which is based on the theory and practice of stochastic Markov Decision Processes (MDP). It uses as input monitoring data from the container runtime, the expected workload and user-related metrics in order to automatically construct a probabilistic finite automaton. The generated automaton is used for both automated decision making and placement success verification. The method is implemented in Java. It also uses the PRISM model-checking tool. Kubernetes is used in order to automate the whole process when orchestrating database containers across Edge, Fog and Cloud infrastructures. Experiments are performed for NoSQL Cassandra database containers for three representative workloads of 50000 (workload 1), 200000 (workload 2) and 500000 (workload 3) CRUD database operations. Five computing infrastructures serve as candidates for database container placement. The new MDP-based method is compared with the widely used Analytic Hierarchy Process (AHP) method. The obtained results are used to analyse container placement decisions. When using the new MDP based method there were no QoS violations in any of the placement cases, while when using the AHP based method the placement results in some QoS threshold violations in all workload cases. Due to its properties, the new MDP method is particularly suitable for implementation. The paper also describes a multi-tier distributed computing system that uses multi-level (infrastructure, container, application) monitoring metrics and Kubernetes","PeriodicalId":403406,"journal":{"name":"2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS)","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115756348","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 17

[Copyright notice] (版权)

2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS) Pub Date : 2019-05-01 DOI: 10.1109/ipdps.2019.00003

引用次数: 0

C-GDR: High-Performance Container-Aware GPUDirect MPI Communication Schemes on RDMA Networks C-GDR:基于RDMA网络的高性能容器感知GPUDirect MPI通信方案

2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS) Pub Date : 2019-05-01 DOI: 10.1109/IPDPS.2019.00034

Jie Zhang, Xiaoyi Lu, Ching-Hsiang Chu, D. Panda

{"title":"C-GDR: High-Performance Container-Aware GPUDirect MPI Communication Schemes on RDMA Networks","authors":"Jie Zhang, Xiaoyi Lu, Ching-Hsiang Chu, D. Panda","doi":"10.1109/IPDPS.2019.00034","DOIUrl":"https://doi.org/10.1109/IPDPS.2019.00034","url":null,"abstract":"In recent years, GPU-based platforms have received significant success for parallel applications. In addition to highly optimized computation kernels on GPUs, the cost of data movement on GPU clusters plays critical roles in delivering high performance for end applications. Many recent studies have been proposed to optimize the performance of GPU-or CUDA-aware communication runtimes and these designs have been widely adopted in the emerging GPU-based applications. These studies mainly focus on improving the communication performance on native environments, i.e., physical machines, however GPU-based communication schemes on cloud environments are not well studied yet. This paper first investigates the performance characteristics of state-of-the-art GPU-based communication schemes on both native and container-based environments, which show a significant demand to design high-performance container-aware communication schemes in GPU-enabled runtimes to deliver near-native performance for end applications on clouds. Next, we propose the C-GDR approach to design high-performance Container-aware GPUDirect communication schemes on RDMA networks. C-GDR allows communication runtimes to successfully detect process locality, GPU residency, NUMA, architecture information, and communication pattern to enable intelligent and dynamic selection of the best communication and data movement schemes on GPU-enabled clouds. We have integrated C-GDR with the MVAPICH2 library. Our evaluations show that MVAPICH2 with C-GDR has clear performance benefits on container-based cloud environments, compared to default MVAPICH2-GDR and Open MPI. For instance, our proposed C-GDR can outperform default MVAPICH2-GDR schemes by up to 66% on micro-benchmarks and up to 26% on HPC applications over a container-based environment.","PeriodicalId":403406,"journal":{"name":"2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128990522","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 4

ParILUT - A Parallel Threshold ILU for GPUs ParILUT -用于gpu的并行门限ILU

2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS) Pub Date : 2019-05-01 DOI: 10.1109/IPDPS.2019.00033

H. Anzt, T. Ribizel, Goran Flegar, Edmond Chow, J. Dongarra

引用次数: 14

[Title page iii] [标题页iii]

2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS) Pub Date : 2019-05-01 DOI: 10.1109/ipdps.2019.00002

引用次数: 0