{"title":"Frontier vs the Exascale Report: Why so long? and Are We Really There Yet?","authors":"P. Kogge, W. Dally","doi":"10.1109/PMBS56514.2022.00008","DOIUrl":"https://doi.org/10.1109/PMBS56514.2022.00008","url":null,"abstract":"Now that the exascale Frontier is here, it is instructive to compare its properties to those projected in the 2008 Exascale technology report, and ask what’s different, why did it seemingly take so long to get here, and what lessons should we take away for future machines. This article discusses these points from the aspects of the original ground rules for the Exascale report, the technologies involved, and the changes in architecture and microarchitecture. It also starts a discussion of zettascale systems.","PeriodicalId":321991,"journal":{"name":"2022 IEEE/ACM International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127708770","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Initial Evaluation of Arm’s Scalable Matrix Extension","authors":"Finn Wilkinson, Simon McIntosh-Smith","doi":"10.1109/PMBS56514.2022.00018","DOIUrl":"https://doi.org/10.1109/PMBS56514.2022.00018","url":null,"abstract":"Expanding upon their Scalable Vector Extension (SVE), Arm have introduced the Scalable Matrix Extension (SME) to improve in-core performance for matrix operations such as matrix multiplication. With the lack of hardware and cycle-accurate simulators that support SME, it is unclear how effective this new instruction set extension will be, and for which types of applications it will provide the most benefit. By adapting The Simulation Engine (SimEng) from the University of Bristol’s High Performance Computing Group to support SME, we aim to compare the simulated performance of a Fujitsu A64FX core (with native SVE support) to a like-for-like hypothetical core with added SME support. By simulating a wide range of Streaming Vector Lengths for our hypothetical SME core model, we provide and discuss first-of-a-kind results for an SME implementation, before discussing future work that will be carried out to further evaluate the suitability of SME.","PeriodicalId":321991,"journal":{"name":"2022 IEEE/ACM International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132900535","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AppEKG: A Simple Unifying View of HPC Applications in Production","authors":"M. Al-Tahat, Strahinja Trecakov, J. Cook","doi":"10.1109/PMBS56514.2022.00017","DOIUrl":"https://doi.org/10.1109/PMBS56514.2022.00017","url":null,"abstract":"While many good development-oriented tools exist for analyzing and improving the performance of HPC applications, the capability for capturing and analyzing the dynamic behavior of applications in real production runs is lacking. Many heavily-used applications do keep some internal metrics of their performance, but there is no unified way of using these. In this paper we present the initial idea of AppEKG, both a concept and a prototype tool for providing a unified, understandable view of HPC application behavior in production. Our prototype AppEKG framework achieves less than 1% overhead, making it usable in production, while still providing dynamic data collection that captures time-varying runtime behavior.","PeriodicalId":321991,"journal":{"name":"2022 IEEE/ACM International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126326163","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"WfBench: Automated Generation of Scientific Workflow Benchmarks","authors":"T. Coleman, H. Casanova, K. Maheshwari, L. Pottier, Sean R. Wilkinson, J. Wozniak, F. Suter, M. Shankar, Rafael Ferreira da Silva","doi":"10.1109/PMBS56514.2022.00014","DOIUrl":"https://doi.org/10.1109/PMBS56514.2022.00014","url":null,"abstract":"The prevalence of scientific workflows with high computational demands calls for their execution on various distributed computing platforms, including large-scale leadership-class high-performance computing (HPC) clusters. To handle the deployment, monitoring, and optimization of workflow executions, many workflow systems have been developed over the past decade. There is a need for workflow benchmarks that can be used to evaluate the performance of workflow systems on current and future software stacks and hardware platforms. We present a generator of realistic workflow benchmark specifications that can be translated into benchmark code to be executed with current workflow systems. Our approach generates workflow tasks with arbitrary performance characteristics (CPU, memory, and I/O usage) and with realistic task dependency structures based on those seen in production workflows. We present experimental results that show that our approach generates benchmarks that are representative of production workflows, and conduct a case study to demonstrate the use and usefulness of our generated benchmarks to evaluate the performance of workflow systems under different configuration scenarios.","PeriodicalId":321991,"journal":{"name":"2022 IEEE/ACM International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS)","volume":"58 11","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113943206","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}