Minimal Counters, Maximum Insight: Simplifying System Performance With HPC Clusters for Optimized Monitoring

IF 1.4; Region 3 (Computer Science); JCR Q4, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE
Shubhi Shukla;Abhijeet Singh;Rajdeep Chakraborty;Anirban Chakraborty;Tejas Rathod;Harshal Mumbaikar;Manoj Kumar Munigala;Madhusudhan K N;Pabitra Mitra;Debdeep Mukhopadhyay
Journal: IEEE Computer Architecture Letters, vol. 24, no. 1, pp. 177-180 (Journal Article)
Publication date: 2025-03-14
DOI: 10.1109/LCA.2025.3570157
URL: https://ieeexplore.ieee.org/document/11003568/
Citations: 0

Abstract

As computer systems become more complex, evaluating performance requires tracking various hardware performance counters that capture the system’s internal activities. While these counters provide valuable insights, their growing number makes it challenging to identify the most relevant ones for performance analysis. In this paper, we investigate the correlation between performance counter values and overall system performance, while also exploring the inter-correlation between different counters. Our findings demonstrate that specific counters are strongly correlated with key performance metrics and that significant redundancy exists among counters. By leveraging these relationships, we propose a method for selecting a small, representative set of performance counters. This streamlined set can further be used to accurately predict performance scores across various workloads and system configurations.
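The abstract does not spell out the selection procedure, so the following is only a minimal sketch of one common way to act on the two observations it reports (some counters correlate strongly with the performance score, and many counters are redundant with each other). It is not the authors' method. The function names `select_counters` and `fit_linear_predictor`, the Pearson-correlation criterion, the greedy ranking, the `redundancy_threshold` of 0.9, the linear predictor, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def select_counters(X, y, redundancy_threshold=0.9, max_counters=5):
    """Greedy, correlation-based counter selection (illustrative assumption).

    X: (samples, counters) array of counter readings.
    y: (samples,) array of performance scores.
    Counters are ranked by |Pearson r| with y; a counter is skipped when its
    |r| with any already-selected counter reaches redundancy_threshold.
    Assumes no counter column is constant (its correlation would be undefined).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_counters = X.shape[1]

    # One correlation matrix over all counters plus the score (last column).
    corr = np.corrcoef(np.column_stack([X, y]), rowvar=False)
    relevance = np.abs(corr[:n_counters, -1])        # counter vs. score
    inter = np.abs(corr[:n_counters, :n_counters])   # counter vs. counter

    selected = []
    for idx in np.argsort(-relevance):               # most relevant first
        if len(selected) >= max_counters:
            break
        if all(inter[idx, s] < redundancy_threshold for s in selected):
            selected.append(int(idx))
    return selected


def fit_linear_predictor(X, y, selected):
    """Least-squares fit of y on the selected counters plus an intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([X[:, selected], np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


if __name__ == "__main__":
    # Synthetic stand-in data; real counter traces would replace this.
    rng = np.random.default_rng(0)
    n = 200
    a = rng.normal(size=n)                    # hypothetical counter 0
    a_dup = a + 0.05 * rng.normal(size=n)     # counter 1, near-duplicate of 0
    b = rng.normal(size=n)                    # hypothetical counter 2
    noise = rng.normal(size=n)                # counter 3, unrelated to score
    X = np.column_stack([a, a_dup, b, noise])
    y = 2.0 * a - 1.5 * b + 0.1 * rng.normal(size=n)

    chosen = select_counters(X, y, max_counters=2)
    print("selected counter indices:", chosen)
    print("coefficients (last is intercept):", fit_linear_predictor(X, y, chosen))
```

In this sketch the redundancy check is what keeps the selected set small: of two near-duplicate counters, only the higher-ranked one survives, and the remaining slot goes to a counter that carries different information about the score.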
Source journal
IEEE Computer Architecture Letters (COMPUTER SCIENCE, HARDWARE & ARCHITECTURE)
CiteScore: 4.60
Self-citation rate: 4.30%
Articles published: 29
Journal description: IEEE Computer Architecture Letters is a rigorously peer-reviewed forum for publishing early, high-impact results in the areas of uni- and multiprocessor computer systems, computer architecture, microarchitecture, workload characterization, performance evaluation and simulation techniques, and power-aware computing. Submissions are welcomed on any topic in computer architecture, especially but not limited to: microprocessor and multiprocessor systems, microarchitecture and ILP processors, workload characterization, performance evaluation and simulation techniques, compiler-hardware and operating system-hardware interactions, interconnect architectures, memory and cache systems, power and thermal issues at the architecture level, I/O architectures and techniques, independent validation of previously published results, analysis of unsuccessful techniques, domain-specific processor architectures (e.g., embedded, graphics, network), real-time and high-availability architectures, reconfigurable systems.