Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing最新文献

Keynote Lecture : Gradient compression for efficient distributed deep learning 主题演讲:用于高效分布式深度学习的梯度压缩

Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing Pub Date : 2021-07-28 DOI: 10.1109/ISPDC52870.2021.9521637

Nikos Deligiannis

引用次数: 0

Keynote Lecture : Towards Robust, Large-scale Concurrent and Distributed Programming 主题演讲:走向稳健、大规模并发和分布式编程

Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing Pub Date : 2021-07-28 DOI: 10.1109/ISPDC52870.2021.9521642

Philipp Haller

引用次数: 0

Keynote Lecture : Neural circuit policies 主题演讲:神经回路策略

Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing Pub Date : 2021-07-28 DOI: 10.1109/ISPDC52870.2021.9521636

R. Grosu

引用次数: 0

Keynote Lecture : Learning Representations: Opportunities for Parallel and Distributed Computing 主题演讲:学习表征:并行和分布式计算的机遇

Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing Pub Date : 2021-07-28 DOI: 10.1109/ISPDC52870.2021.9521632

D. Rus

引用次数: 0

The Supercomputer "Fugaku" and Arm-SVE enabled A64FX processor for energy-efficiency and sustained application performance 超级计算机“Fugaku”和Arm-SVE支持的A64FX处理器具有能效和持续的应用性能

Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing Pub Date : 2020-07-01 DOI: 10.1109/ISPDC51135.2020.00009

M. Sato

引用次数: 10

Security Applications of GPUs gpu的安全应用

Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing Pub Date : 2019-03-13 DOI: 10.5772/INTECHOPEN.81885

G. Vasiliadis

{"title":"Security Applications of GPUs","authors":"G. Vasiliadis","doi":"10.5772/INTECHOPEN.81885","DOIUrl":"https://doi.org/10.5772/INTECHOPEN.81885","url":null,"abstract":"Despite the recent advances in software security hardening techniques, vulner-abilities can always be exploited if the attackers are really determined. Regardless the protection enabled, successful exploitation can always be achieved, even though admittedly, today, it is much harder than it was in the past. Since securing software is still under ongoing research, the community investigates detection methods in order to protect software. Three of the most promising such methods are monitoring the (i) network, (ii) the filesystem, and (iii) the host memory, for possible exploitation. Whenever a malicious operation is detected then the monitor should be able to terminate it and/or alert the administrator. In this chapter, we explore how to utilize the highly parallel capabilities of modern commodity graphics processing units (GPUs) in order to improve the performance of different security tools operating at the network, storage, and memory level, and how they can offload the CPU whenever possible. Our results show that modern GPUs can be very efficient and highly effective at accelerating the pattern matching operations of network intrusion detection systems and antivirus tools, as well as for monitoring the integrity of the base computing systems.","PeriodicalId":20515,"journal":{"name":"Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing","volume":"43 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-03-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73609684","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 1

Introductory Chapter: High Performance Parallel Computing 导论:高性能并行计算

Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing Pub Date : 2019-02-04 DOI: 10.5772/intechopen.84193

Satyadhyan Chickerur

引用次数: 0

Design and Implementation of Particle Systems for Meshfree Methods with High Performance 高性能无网格法粒子系统的设计与实现

Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing Pub Date : 2018-12-31 DOI: 10.5772/INTECHOPEN.81755

G. Bilotta, V. Zago, A. Hérault

{"title":"Design and Implementation of Particle Systems for Meshfree Methods with High Performance","authors":"G. Bilotta, V. Zago, A. Hérault","doi":"10.5772/INTECHOPEN.81755","DOIUrl":"https://doi.org/10.5772/INTECHOPEN.81755","url":null,"abstract":"Particle systems, commonly associated with computer graphics, animation, and video games, are an essential component in the implementation of numerical methods ranging from the meshfree methods for computational fluid dynamics and related applications (e.g., smoothed particle hydrodynamics, SPH) to minimization methods for arbitrary problems (e.g., particle swarm optimization, PSO). These methods are frequently embarrassingly parallel in nature, making them a natural fit for implementation on massively parallel computational hardware such as modern graphics processing units (GPUs). However, naive implementations fail to fully exploit the capabilities of this hardware. We present practical solutions to the challenges faced in the efficient parallel implementation of these particle systems, with a focus on performance, robustness, and flexibility. The techniques are illustrated through GPUSPH, the first implementation of SPH to run completely on GPU, and currently supporting multi-GPU clusters, uniform precision independent of domain size, and multiple SPH formulations.","PeriodicalId":20515,"journal":{"name":"Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing","volume":"46 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85387904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 3

Characterizing Power and Energy Efficiency of Legion Data-Centric Runtime and Applications on Heterogeneous High-Performance Computing Systems 异构高性能计算系统中以数据为中心的运行时的功率和能源效率特征

Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing Pub Date : 2018-12-10 DOI: 10.5772/INTECHOPEN.81124

Song Huang, Song Fu, S. Pakin, M. Lang

{"title":"Characterizing Power and Energy Efficiency of Legion Data-Centric Runtime and Applications on Heterogeneous High-Performance Computing Systems","authors":"Song Huang, Song Fu, S. Pakin, M. Lang","doi":"10.5772/INTECHOPEN.81124","DOIUrl":"https://doi.org/10.5772/INTECHOPEN.81124","url":null,"abstract":"The traditional parallel programming models require programmers to explicitly specify parallelism and data movement of the underlying parallel mechanisms. Different from the traditional computation-centric programming, Legion provides a data-centric programming model for extracting parallelism and data movement. In this chapter, we aim to characterize the power and energy consumption of running HPC applications on Legion. We run benchmark applications on compute nodes equipped with both CPU and GPU, and measure the execution time, power consumption and CPU/GPU utilization. Additionally, we test the message passing interface (MPI) version of these applications and compare the performance and power consumption of high-performance computing (HPC) applications using the computation-centric and data-centric programming models. Experimental results indicate Legion applications outperforms MPI applications on both performance and energy efficiency, i.e., Legion applications can be 9.17 times as fast as MPI applications and use only 9.2% energy. Legion effectively explores the heterogeneous architecture and runs applications tasks on GPU. As far as we know, this is the first study to understand the power and energy consumption of Legion programming and runtime infrastructure. Our findings will enable HPC system designers and operators to develop and tune the performance of data-centric HPC applications with constraints on power and energy consumption.","PeriodicalId":20515,"journal":{"name":"Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing","volume":"11 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73114387","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 1

Particle-Based Fused Rendering 基于粒子的融合渲染

Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing Pub Date : 2018-11-24 DOI: 10.5772/INTECHOPEN.81191

K. Koyamada, Naohisa Sakamoto

{"title":"Particle-Based Fused Rendering","authors":"K. Koyamada, Naohisa Sakamoto","doi":"10.5772/INTECHOPEN.81191","DOIUrl":"https://doi.org/10.5772/INTECHOPEN.81191","url":null,"abstract":"In this chapter, we propose a fused rendering technique that can integrally handle multiple irregular volumes. Although there is a strong requirement for understanding large-scale datasets generated from coupled simulation techniques such as computational structure mechanics (CSM) and computational fluid dynamics (CFD), there is no fused rendering technique to the best of our knowledge. For this purpose, we can employ the particle-based volume rendering (PBVR) technique for each irregular volume dataset. Since the current PBVR technique regards an irregular cell as a planar footprint during depth evaluation, the straightforward employment causes some artifacts especially at the cell boundaries. To solve the problem, we calculate the depth value based on the assumption that the opacity describes the cumulative distribution function (CDF) of a probability variable, w, which shows a length from the entry point in the fragment interval in the cell. In our experiments, we applied our method to numerical simulation results in which two different irregular grid cells are defined in the same space and confirmed its effectiveness with respect to the image quality.","PeriodicalId":20515,"journal":{"name":"Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing","volume":"44 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-11-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82772687","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0