ChplBlamer
Hui Zhang, Jeffrey K. Hollingsworth
Proceedings of the 2018 International Conference on Supercomputing, 2018-06-12
DOI: 10.1145/3205289.3205314 (https://doi.org/10.1145/3205289.3205314)
Citations: 2
Abstract
Parallel programming is hard, and analyzing parallel programs to identify specific performance bottlenecks is even harder. Chapel is an emerging Partitioned Global Address Space (PGAS) language designed for productive parallel programming. Most established profilers either cannot profile Chapel programs at all or produce information that offers little useful guidance at the user level. To address this issue, we developed ChplBlamer to pinpoint performance losses due to data distribution and remote data accesses. We combine data-centric and code-centric approaches to help Chapel users quickly identify performance bottlenecks in the source code. To demonstrate the utility of ChplBlamer, we studied three multi-locale Chapel benchmarks. For each benchmark, ChplBlamer found the causes of the performance losses. With the optimization guidance provided by ChplBlamer, we improved performance by up to 4x with little code modification.