2020 IEEE/ACM Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS): Latest Publications

Performance Trade-offs in GPU Communication: A Study of Host and Device-initiated Approaches
Taylor L. Groves, Benjamin Brock, Yuxin Chen, K. Ibrahim, Lenny Oliker, N. Wright, Samuel Williams, K. Yelick
DOI: https://doi.org/10.1109/PMBS51919.2020.00016 | Published: 2020-11-01
Abstract: Network communication on GPU-based systems is a significant roadblock for many applications with small but frequent messaging requirements. A common question from application developers is: "How can we reduce the overheads and achieve the best communication performance on GPUs?" This work examines device-initiated versus host-initiated inter-node GPU communication using NVSHMEM. We derive basic communication model parameters for single-message and batched communication before validating our model against distributed GEMM benchmarks. We use our model to estimate performance benefits for applications transitioning from CPUs to GPUs for fixed-size and scaled workloads, and provide general guidelines for reducing communication overheads. Our findings show that the host-initiated approach generally outperforms the device-initiated approach for the system evaluated.
Citations: 3

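The abstract describes deriving basic communication model parameters for single-message and batched transfers. Below is a minimal latency-bandwidth (alpha-beta) cost sketch of that kind of model; the function name and all parameter values are illustrative assumptions, not figures from the paper, and the device-initiated path is given a higher per-message overhead only to loosely mirror the paper's qualitative finding.

```cpp
#include <cstdio>

// Hypothetical latency-bandwidth model: time to send n_msgs messages of
// msg_bytes each, with a per-message initiation cost alpha (seconds) and an
// inverse bandwidth beta (seconds per byte).
double transfer_time(double alpha, double beta, int n_msgs, double msg_bytes) {
    return n_msgs * (alpha + msg_bytes * beta);
}

int main() {
    // Illustrative, made-up parameters (not measured values from the paper).
    const double alpha_host = 2.0e-6;        // per-message overhead, host-initiated
    const double alpha_dev  = 5.0e-6;        // per-message overhead, device-initiated
    const double beta       = 1.0 / 12.0e9;  // ~12 GB/s inter-node bandwidth

    for (double bytes : {8.0, 1024.0, 1048576.0}) {
        std::printf("%9.0f B x 1000 msgs  host: %.3e s  device: %.3e s\n", bytes,
                    transfer_time(alpha_host, beta, 1000, bytes),
                    transfer_time(alpha_dev, beta, 1000, bytes));
    }
    return 0;
}
```

In such a model the per-message overhead term dominates for small messages, which is why batching and reducing initiation cost matter most for small, frequent messaging.
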
The Performance and Energy Efficiency Potential of FPGAs in Scientific Computing
T. Nguyen, Samuel Williams, Marco Siracusa, Colin MacLean, D. Doerfler, N. Wright
DOI: https://doi.org/10.1109/PMBS51919.2020.00007 | Published: 2020-11-01
Abstract: Hardware specialization is a promising direction for the future of digital computing. Reconfigurable technologies enable hardware specialization with modest non-recurring engineering cost. In this paper, we use FPGAs to evaluate the benefits of building specialized hardware for numerical kernels found in scientific applications. In order to properly evaluate performance, we not only compare Intel Arria 10 and Xilinx U280 performance against Intel Xeon, Intel Xeon Phi, and NVIDIA V100 GPUs, but also extend the Empirical Roofline Toolkit (ERT) to FPGAs in order to assess our results in terms of the Roofline model. Although FPGA performance is known to be far less than that of a GPU, we also benchmark the energy efficiency of each platform for the scientific kernels, comparing it to microbenchmark and technological limits. Results show that while FPGAs struggle to compete in absolute terms with GPUs on memory- and compute-intensive kernels, they require far less power and can deliver nearly the same energy efficiency.
Citations: 12

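The results are assessed in terms of the Roofline model, which bounds attainable throughput by the minimum of peak compute and arithmetic intensity times peak memory bandwidth. The C++ sketch below evaluates that bound and a derived energy-efficiency figure; the platform names and numbers are made up for illustration and are not measurements from the paper.

```cpp
#include <algorithm>
#include <cstdio>

// Classic Roofline bound: attainable GFLOP/s is limited either by peak compute
// or by arithmetic intensity (flops per byte) times peak memory bandwidth.
double roofline_gflops(double peak_gflops, double peak_gbs, double ai) {
    return std::min(peak_gflops, ai * peak_gbs);
}

int main() {
    struct Platform { const char* name; double gflops, gbs, watts; };
    // Made-up platform numbers, for illustration only.
    const Platform platforms[] = { {"fpga-like", 600.0, 70.0, 80.0},
                                   {"gpu-like", 7000.0, 900.0, 300.0} };

    const double ai = 0.25;  // e.g., a memory-bound streaming kernel
    for (const Platform& p : platforms) {
        double perf = roofline_gflops(p.gflops, p.gbs, ai);
        std::printf("%-10s attainable %8.1f GFLOP/s, %6.3f GFLOP/s per watt\n",
                    p.name, perf, perf / p.watts);
    }
    return 0;
}
```

Dividing the attainable performance by board power is one simple way to compare energy efficiency across platforms, which is the kind of comparison the abstract draws between FPGAs and GPUs.
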
[Copyright notice]
DOI: https://doi.org/10.1109/pmbs51919.2020.00002 | Published: 2020-11-01
Citations: 0

Evaluating the Performance of NVIDIA’s A100 Ampere GPU for Sparse and Batched Computations
H. Anzt, Y. M. Tsai, A. Abdelfattah, T. Cojean, J. Dongarra
DOI: https://doi.org/10.1109/PMBS51919.2020.00009 | Published: 2020-11-01
Abstract: GPU accelerators have become an important backbone for scientific high-performance computing, and the performance advances obtained from adopting new GPU hardware are significant. In this paper we take a first look at NVIDIA’s newest server-line GPU, the A100 architecture, part of the Ampere generation. Specifically, we assess its performance for sparse and batched computations, as these routines are relied upon in many scientific applications, and compare it to the performance achieved on NVIDIA’s previous server-line GPU.
Citations: 9

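Sparse matrix-vector products are among the routines evaluated on the A100. As a point of reference for what is being benchmarked, here is a plain CSR SpMV in C++; this is a host-side sketch only, and the paper's GPU kernels are not reproduced here.

```cpp
#include <cstdio>
#include <vector>

// y = A*x with A stored in CSR (compressed sparse row) format.
void spmv_csr(const std::vector<int>& rowptr, const std::vector<int>& col,
              const std::vector<double>& val, const std::vector<double>& x,
              std::vector<double>& y) {
    for (size_t i = 0; i + 1 < rowptr.size(); ++i) {
        double sum = 0.0;
        for (int k = rowptr[i]; k < rowptr[i + 1]; ++k)
            sum += val[k] * x[col[k]];
        y[i] = sum;
    }
}

int main() {
    // 3x3 example matrix: [2 0 1; 0 3 0; 4 0 5]
    std::vector<int> rowptr = {0, 2, 3, 5};
    std::vector<int> col    = {0, 2, 1, 0, 2};
    std::vector<double> val = {2, 1, 3, 4, 5};
    std::vector<double> x = {1, 1, 1}, y(3);
    spmv_csr(rowptr, col, val, x, y);
    std::printf("y = [%g, %g, %g]\n", y[0], y[1], y[2]);  // expected [3, 3, 9]
    return 0;
}
```
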
Benchmarking Julia’s Communication Performance: Is Julia HPC ready or Full HPC?
S. Hunold, Sebastian Steiner
DOI: https://doi.org/10.1109/PMBS51919.2020.00008 | Published: 2020-11-01
Abstract: Julia has quickly become one of the main programming languages for computational sciences, mainly due to its speed and flexibility. The speed and efficiency of Julia are the main reasons why researchers in the field of High Performance Computing have started porting their applications to Julia. Since Julia has a very small binding overhead to C, many efficient computational kernels can be integrated into Julia without any noticeable performance drop. For that reason, highly tuned libraries, such as the Intel MKL or OpenBLAS, will allow Julia applications to achieve similar computational performance as their C counterparts. Yet, two questions remain: 1) How fast is Julia for memory-bound applications? 2) How efficiently can MPI functions be called from a Julia application? In this paper, we assess the performance of Julia with respect to HPC. To that end, we examine the raw throughput achievable with Julia using a new Julia port of the well-known STREAM benchmark. We also compare the running times of the most commonly used MPI collective operations (e.g., MPI_Allreduce) with their C counterparts. Our analysis shows that the HPC performance of Julia is on par with C in the majority of cases.
Citations: 7

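The paper compares the running times of MPI collective operations called from Julia with their C counterparts. A minimal C++/MPI timing loop of the kind such a comparison relies on is sketched below; the message size and repetition count are arbitrary choices, not the paper's benchmark configuration.

```cpp
#include <mpi.h>
#include <cstdio>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int count = 1024, reps = 100;   // arbitrary benchmark sizes
    std::vector<double> send(count, 1.0), recv(count);

    MPI_Barrier(MPI_COMM_WORLD);          // align ranks before timing
    double t0 = MPI_Wtime();
    for (int r = 0; r < reps; ++r)
        MPI_Allreduce(send.data(), recv.data(), count, MPI_DOUBLE, MPI_SUM,
                      MPI_COMM_WORLD);
    double per_call = (MPI_Wtime() - t0) / reps;

    if (rank == 0)
        std::printf("MPI_Allreduce(%d doubles): %.3e s per call\n", count, per_call);
    MPI_Finalize();
    return 0;
}
```

A Julia port of the same measurement (via MPI.jl) can be timed identically, which is how an apples-to-apples comparison between the two languages is usually set up.
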
Lightweight Measurement and Analysis of HPC Performance Variability
Jered Dominguez-Trujillo, Keira Haskins, S. J. Khouzani, Chris Leap, Sahba Tashakkori, Quincy Wofford, Trilce Estrada, P. Bridges, Patrick M. Widener
DOI: https://doi.org/10.1109/PMBS51919.2020.00011 | Published: 2020-11-01
Abstract: Performance variation deriving from hardware and software sources is common in modern scientific and data-intensive computing systems, and synchronization in parallel and distributed programs often exacerbates their impacts at scale. The decentralized and emergent effects of such variation are, unfortunately, also difficult to systematically measure, analyze, and predict; modeling assumptions which are stringent enough to make analysis tractable frequently cannot be guaranteed at meaningful application scales, and longitudinal methods at such scales can require the capture and manipulation of impractically large amounts of data. This paper describes a new, scalable, and statistically robust approach for effective modeling, measurement, and analysis of large-scale performance variation in HPC systems. Our approach avoids the need to reason about complex distributions of runtimes among large numbers of individual application processes by focusing instead on the maximum length of distributed workload intervals. We describe this approach and its implementation in MPI which makes it applicable to a diverse set of HPC workloads. We also present evaluations of these techniques for quantifying and predicting performance variation carried out on large-scale computing systems, and discuss the strengths and limitations of the underlying modeling assumptions.
Citations: 0

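The approach avoids reasoning about full runtime distributions by focusing on the maximum length of distributed workload intervals. A minimal MPI sketch of collecting that statistic is shown below; the timed workload is a placeholder, and the reduction structure is an assumption about how such a measurement could be implemented rather than the paper's actual instrumentation.

```cpp
#include <mpi.h>
#include <cmath>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    // Placeholder per-rank workload interval (any timed region of interest).
    double t0 = MPI_Wtime();
    double s = 0.0;
    for (int i = 1; i < 1000000; ++i) s += std::sqrt(static_cast<double>(i));
    double interval = MPI_Wtime() - t0;

    // The statistic of interest: the slowest rank's interval length.
    double max_interval = 0.0;
    MPI_Reduce(&interval, &max_interval, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);

    if (rank == 0)
        std::printf("max workload interval: %.3e s (checksum %g)\n", max_interval, s);
    MPI_Finalize();
    return 0;
}
```

Because synchronized programs proceed at the pace of their slowest process, tracking only this maximum keeps the measurement lightweight while still capturing the variation that determines time to solution.
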
Warwick Data Store: A Data Structure Abstraction Library
Richard O. Kirk, M. Nolten, R. Kevis, T. Law, S. Maheswaran, Steven A. Wright, S. Powell, G. Mudalige, S. Jarvis
DOI: https://doi.org/10.1109/PMBS51919.2020.00013 | Published: 2020-11-01
Abstract: With the increasing complexity of memory architectures and scientific applications, developing data structures that are performant, portable, and scalable while supporting developer productivity is a challenging task. In this paper, we present Warwick Data Store (WDS), a lightweight and extensible C++ template library designed to manage these complexities and allow rapid prototyping. WDS is designed to abstract details of the underlying data structures away from the user, thus easing application development and optimisation. We show that using WDS does not significantly impact achieved performance across a variety of different scientific benchmarks and proxy-applications, compilers, and architectures. The overheads are largely below 30% for smaller problems, decreasing to below 10% for larger problems. This shows that the library does not significantly impact performance, while providing additional functionality and the ability to optimise data structures without changing the application code.
Citations: 2

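WDS hides the underlying data structure behind a uniform interface so that layouts can be optimised without changing application code. The library's own API is not reproduced here; the following generic C++ sketch, with all names invented, only illustrates that idea by exposing an array-of-structures and a structure-of-arrays layout through the same accessor.

```cpp
#include <cstdio>
#include <vector>

// Two storage policies exposing the same set/sum interface, so application
// code is written once and the layout can be swapped without touching it.
struct SoAField {                       // structure-of-arrays layout
    std::vector<double> x, y;
    explicit SoAField(size_t n) : x(n), y(n) {}
    void set(size_t i, double vx, double vy) { x[i] = vx; y[i] = vy; }
    double sum_x() const { double s = 0; for (double v : x) s += v; return s; }
};

struct AoSField {                       // array-of-structures layout
    struct Elem { double x, y; };
    std::vector<Elem> data;
    explicit AoSField(size_t n) : data(n) {}
    void set(size_t i, double vx, double vy) { data[i] = {vx, vy}; }
    double sum_x() const { double s = 0; for (const Elem& e : data) s += e.x; return s; }
};

// Application code is layout-agnostic: the container type is a template parameter.
template <class Field>
double fill_and_reduce(size_t n) {
    Field f(n);
    for (size_t i = 0; i < n; ++i) f.set(i, static_cast<double>(i), 2.0 * i);
    return f.sum_x();
}

int main() {
    std::printf("SoA: %g  AoS: %g\n",
                fill_and_reduce<SoAField>(1000), fill_and_reduce<AoSField>(1000));
    return 0;
}
```
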
Developing Models for the Runtime of Programs With Exponential Runtime Behavior
Michael Burger, Giang Nam Nguyen, C. Bischof
DOI: https://doi.org/10.1109/PMBS51919.2020.00015 | Published: 2020-11-01
Abstract: In this paper, we present a new approach to generate runtime models for programs whose runtime grows exponentially with the value of one input parameter. Such programs are of high interest, e.g., in cryptanalysis, for analyzing the practical security of traditional and post-quantum secure schemes. The model generation approach, based on profiled training runs, builds on ideas realized in the open-source tool Extra-P, extended with a new class of model functions and a shared-memory-parallel simulated annealing approach to heuristically determine coefficients for the model functions. Our approach is implemented in the open-source software SimAnMo (Simulated Annealing Modeler). We demonstrate on various theoretical, synthetic, and practical test cases that our approach delivers very accurate models and reliable predictions compared to standard approaches on x86 and ARM architectures. SimAnMo is also employed to generate models of four codes used to solve the so-called shortest vector problem, an important problem from the field of lattice-based cryptography. We demonstrate the quality of our models with measurements for higher lattice dimensions, as far as feasible. Additionally, we highlight inherent problems with models for algorithms with exponential runtime.
Citations: 1

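The modeling approach fits exponential model functions to profiled runtimes using simulated annealing. The toy C++ sketch below fits t(n) = a * b^n to synthetic data with a naive annealing schedule; it illustrates the general fitting idea only and is not SimAnMo's actual algorithm, model class, or parameterisation.

```cpp
#include <cmath>
#include <cstdio>
#include <random>
#include <utility>
#include <vector>

// Sum of squared log-errors of the model t(n) = a * b^n against samples (n, t).
double cost(double a, double b, const std::vector<std::pair<int, double>>& d) {
    double c = 0.0;
    for (const auto& [n, t] : d) {
        double e = std::log(t) - std::log(a * std::pow(b, n));
        c += e * e;
    }
    return c;
}

int main() {
    // Synthetic "measurements" of a program with runtime 0.5 * 1.8^n seconds.
    std::vector<std::pair<int, double>> data;
    for (int n = 10; n <= 20; ++n) data.push_back({n, 0.5 * std::pow(1.8, n)});

    std::mt19937 rng(42);
    std::normal_distribution<double> step(0.0, 0.05);
    std::uniform_real_distribution<double> u(0.0, 1.0);

    double a = 1.0, b = 1.5;                              // crude starting guess
    double cur = cost(a, b, data);
    for (double T = 1.0; T > 1e-4; T *= 0.999) {          // geometric cooling
        double na = a + step(rng), nb = b + step(rng);    // random neighbour
        if (na <= 0.0 || nb <= 1.0) continue;             // keep model exponential
        double c = cost(na, nb, data);
        if (c < cur || u(rng) < std::exp((cur - c) / T)) {  // Metropolis acceptance
            a = na; b = nb; cur = c;
        }
    }
    std::printf("fitted model: %.3f * %.3f^n  (cost %.2e)\n", a, b, cur);
    return 0;
}
```

Fitting in log space keeps the huge spread of exponential runtimes from letting the largest samples dominate the objective, which is one of the practical difficulties the paper highlights for exponential models.
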
Performance Modeling of Streaming Kernels and Sparse Matrix-Vector Multiplication on A64FX
C. Alappat, Jan Laukemann, T. Gruber, G. Hager, G. Wellein, N. Meyer, T. Wettig
DOI: https://doi.org/10.1109/PMBS51919.2020.00006 | Published: 2020-09-29
Abstract: The A64FX CPU powers the current #1 supercomputer on the Top500 list. Although it is a traditional cache-based multicore processor, its peak performance and memory bandwidth rival accelerator devices. Generating efficient code for such a new architecture requires a good understanding of its performance features. Using these features, we construct the Execution-Cache-Memory (ECM) performance model for the A64FX processor in the FX700 supercomputer and validate it using streaming loops. We also identify architectural peculiarities and derive optimization hints. Applying the ECM model to sparse matrix-vector multiplication (SpMV), we motivate why the CRS matrix storage format is inappropriate and how the SELL-C-σ format with suitable code optimizations can achieve bandwidth saturation for SpMV.
Citations: 14

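The abstract motivates the SELL-C-σ storage format over CRS for SpMV. The C++ sketch below shows an SpMV kernel over a SELL-C-style layout with C = 2, hand-built data, and the σ row-sorting step omitted; it is meant only to illustrate the chunked, column-major storage that makes the inner loop unit-stride and SIMD-friendly, not the paper's optimized A64FX code.

```cpp
#include <cstdio>
#include <vector>

// SpMV for a matrix in a SELL-C-sigma-style layout: rows are grouped into
// chunks of C rows, each chunk is padded to its longest row, and values are
// stored column-major within the chunk so consecutive values belong to
// consecutive rows.
void spmv_sell_c(int C, const std::vector<int>& chunk_ptr,
                 const std::vector<int>& col, const std::vector<double>& val,
                 const std::vector<double>& x, std::vector<double>& y) {
    int nchunks = static_cast<int>(chunk_ptr.size()) - 1;
    for (int c = 0; c < nchunks; ++c) {
        int base = chunk_ptr[c];
        int len = (chunk_ptr[c + 1] - base) / C;   // padded row length of chunk
        for (int j = 0; j < len; ++j)
            for (int r = 0; r < C; ++r) {          // unit-stride over the chunk
                int idx = base + j * C + r;
                y[c * C + r] += val[idx] * x[col[idx]];
            }
    }
}

int main() {
    // 4x4 toy matrix: rows {2@0, 1@2}, {3@1}, {4@0, 5@2, 1@3}, {6@3}; C = 2.
    std::vector<int> chunk_ptr = {0, 4, 10};
    std::vector<int> col       = {0, 1, 2, 0,   0, 3, 2, 0, 3, 0};   // 0 marks padding
    std::vector<double> val    = {2, 3, 1, 0,   4, 6, 5, 0, 1, 0};   // 0 marks padding
    std::vector<double> x = {1, 1, 1, 1}, y(4, 0.0);
    spmv_sell_c(2, chunk_ptr, col, val, x, y);
    std::printf("y = [%g, %g, %g, %g]\n", y[0], y[1], y[2], y[3]);   // expected [3, 3, 10, 6]
    return 0;
}
```
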
Message from the Workshop Chairs
K. Ong, K. Smith‐Miles, Vincent C. S. Lee, W. Ng
DOI: https://doi.org/10.1109/AIDM.2006.11 | Published: 2019-11-01
Citations: 0