统一硬件计数器指标的性能分析

B. Gravelle, W. Nystrom, B. Norris
{"title":"统一硬件计数器指标的性能分析","authors":"B. Gravelle, W. Nystrom, B. Norris","doi":"10.1109/PMBS56514.2022.00011","DOIUrl":null,"url":null,"abstract":"Hardware performance counters provide detailed insight into the performance of applications running on modern systems, but they can be challenging to use without detailed knowledge of the computational and counter architectures. Our work addresses this challenge by identifying metrics that are common to many micro-architectures and can be directly related to the algorithms in question. These metrics, some long used and some being presented for the first time, are carefully designed to be easy to follow, informative, and portable to multiple systems. In this paper, we discuss the background of empirical performance analysis, describe our set of metrics, and demonstrate analysis on example benchmarks and mini-applications. The metrics and examples are presented on both an Intel Xeon Cascade Lake and an ARM-based Fujitsu A64FX. The significant differences in the ISAs, caches and hardware counters between these two systems demonstrate the portability of the proposed metrics. 1","PeriodicalId":321991,"journal":{"name":"2022 IEEE/ACM International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Performance Analysis with Unified Hardware Counter Metrics\",\"authors\":\"B. Gravelle, W. Nystrom, B. Norris\",\"doi\":\"10.1109/PMBS56514.2022.00011\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Hardware performance counters provide detailed insight into the performance of applications running on modern systems, but they can be challenging to use without detailed knowledge of the computational and counter architectures. Our work addresses this challenge by identifying metrics that are common to many micro-architectures and can be directly related to the algorithms in question. These metrics, some long used and some being presented for the first time, are carefully designed to be easy to follow, informative, and portable to multiple systems. In this paper, we discuss the background of empirical performance analysis, describe our set of metrics, and demonstrate analysis on example benchmarks and mini-applications. The metrics and examples are presented on both an Intel Xeon Cascade Lake and an ARM-based Fujitsu A64FX. The significant differences in the ISAs, caches and hardware counters between these two systems demonstrate the portability of the proposed metrics. 1\",\"PeriodicalId\":321991,\"journal\":{\"name\":\"2022 IEEE/ACM International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS)\",\"volume\":\"25 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE/ACM International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/PMBS56514.2022.00011\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE/ACM International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PMBS56514.2022.00011","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

硬件性能计数器提供了对在现代系统上运行的应用程序性能的详细了解,但是如果不了解计算和计数器体系结构的详细知识,使用它们可能会很困难。我们的工作通过识别许多微体系结构中常见的指标来解决这一挑战,这些指标可以直接与所讨论的算法相关。这些指标,有些是长期使用的,有些是第一次提出的,都经过精心设计,易于遵循,信息丰富,并且可移植到多个系统。在本文中,我们讨论了实证性能分析的背景,描述了我们的指标集,并在示例基准测试和小型应用程序上演示了分析。在Intel Xeon Cascade Lake和基于arm的Fujitsu A64FX上给出了度量和示例。这两个系统在isa、缓存和硬件计数器方面的显著差异证明了所建议指标的可移植性。1
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Performance Analysis with Unified Hardware Counter Metrics
Hardware performance counters provide detailed insight into the performance of applications running on modern systems, but they can be challenging to use without detailed knowledge of the computational and counter architectures. Our work addresses this challenge by identifying metrics that are common to many micro-architectures and can be directly related to the algorithms in question. These metrics, some long used and some being presented for the first time, are carefully designed to be easy to follow, informative, and portable to multiple systems. In this paper, we discuss the background of empirical performance analysis, describe our set of metrics, and demonstrate analysis on example benchmarks and mini-applications. The metrics and examples are presented on both an Intel Xeon Cascade Lake and an ARM-based Fujitsu A64FX. The significant differences in the ISAs, caches and hardware counters between these two systems demonstrate the portability of the proposed metrics. 1
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信