{"title":"Understanding the Performance of GPGPU Applications from a Data-Centric View","authors":"Hui Zhang, J. Hollingsworth","doi":"10.1109/ProTools49597.2019.00006","DOIUrl":"https://doi.org/10.1109/ProTools49597.2019.00006","url":null,"abstract":"Using a CPU-GPU hybrid computing framework is becoming a common configuration for supercomputers. The wide deployment of GPUs (as well as other hardware accelerators) brings to the HPC community a big question: Are we using them effectively? Inappropriate use of GPUs can generate incorrect results in certain cases, but more often, will slow down the program instead of speeding it up. This paper describes a tool that satisfies the needs of programmers to analyze the runtime performance of kernels and obtain insights for better GPU utilization. Compared to existing GPU performance tools, ours provides some unique features: data-centric profiling and generating complete GPU call stacks. With the guidance of the tool, we were able to improve the kernel performance of three widely-studied GPU benchmarks by a factor of up to 46.6x with minor code modification.","PeriodicalId":418029,"journal":{"name":"2019 IEEE/ACM International Workshop on Programming and Performance Visualization Tools (ProTools)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126780602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Message from the ProTools 2019 Workshop Chairs","authors":"","doi":"10.1109/protools49597.2019.00004","DOIUrl":"https://doi.org/10.1109/protools49597.2019.00004","url":null,"abstract":"Understanding program behavior is critical to overcome the expected architectural and programming complexities that arise on modern high performance computing (HPC) platforms, such as limited power budgets, heterogeneity, hierarchical memories, shrinking I/O bandwidths, and performance variability. In order to do so, HPC software developers need intuitive tools for debugging, performance measurement, analysis, and tuning of large-scale HPC applications. Moreover, data collected from these tools such as hardware counters, communication traces, and network traffic can be far too large and too complex to be analyzed in a straightforward manner. We need new automatic analysis and visualization approaches to help application developers intuitively understand the multiple, interdependent effects that algorithmic choices have on application correctness or performance. The Workshop on Programming and Performance Visualization Tools (ProTools) intends to bring together HPC application developers, tool developers, and researchers from the visualization, performance, and program analysis fields for an exchange of new approaches to assist developers in analyzing, understanding, and optimizing programs for extreme-scale platforms.","PeriodicalId":418029,"journal":{"name":"2019 IEEE/ACM International Workshop on Programming and Performance Visualization Tools (ProTools)","volume":"779 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134549520","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
T. Islam, Alexis Ayala, Quentin Jensen, K. Ibrahim
{"title":"Toward a Programmable Analysis and Visualization Framework for Interactive Performance Analytics","authors":"T. Islam, Alexis Ayala, Quentin Jensen, K. Ibrahim","doi":"10.1109/ProTools49597.2019.00015","DOIUrl":"https://doi.org/10.1109/ProTools49597.2019.00015","url":null,"abstract":"Understanding the performance characteristics of applications in modern HPC environments is becoming more challenging due to the increase in the architectural and programming complexities. HPC software developers rely on sources such as hardware counters and event traces to infer performance problems while focusing on designing mitigation strategies. A large number of in-house tools exist in the community, which indicate replicated effort. This paper presents a customizable framework for analyzing performance measurements and visualizing through a web-based interactive dashboard for interactively exploring a large volume of hierarchical information. In this paper, we analyze three ECP applications as use cases and identify as well as optimize problematic resource utilization behaviors exposed by our visualizations. This framework is a step towards a unified platform for visual identification of performance scaling bottlenecks to ease the collaboration between application developers, performance analysts, and hardware vendors.","PeriodicalId":418029,"journal":{"name":"2019 IEEE/ACM International Workshop on Programming and Performance Visualization Tools (ProTools)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131938269","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}