Rapid identification of architectural bottlenecks via precise event counting

J. Demme, S. Sethumadhavan
{"title":"Rapid identification of architectural bottlenecks via precise event counting","authors":"J. Demme, S. Sethumadhavan","doi":"10.1145/2000064.2000107","DOIUrl":null,"url":null,"abstract":"On-chip performance counters play a vital role in computer architecture research due to their ability to quickly provide insights into application behaviors that are time consuming to characterize with traditional methods. The usefulness of modern performance counters, however, is limited by inefficient techniques used today to access them. Current access techniques rely on imprecise sampling or heavyweight kernel interaction forcing users to choose between precision or speed and thus restricting the use of performance counter hardware. In this paper, we describe new methods that enable precise, lightweight interfacing to on-chip performance counters. These low-overhead techniques allow precise reading of virtualized counters in low tens of nanoseconds, which is one to two orders of magnitude faster than current access techniques. Further, these tools provide several fresh insights on the behavior of modern parallel programs such as MySQL and Firefox, which were previously obscured (or impossible to obtain) by existing methods for characterization. Based on case studies with our new access methods, we discuss seven implications for computer architects in the cloud era and three methods for enhancing hardware counters further. Taken together, these observations have the potential to open up new avenues for architecture research.","PeriodicalId":340732,"journal":{"name":"2011 38th Annual International Symposium on Computer Architecture (ISCA)","volume":"64 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"74","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 38th Annual International Symposium on Computer Architecture (ISCA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2000064.2000107","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 74

Abstract

On-chip performance counters play a vital role in computer architecture research due to their ability to quickly provide insights into application behaviors that are time consuming to characterize with traditional methods. The usefulness of modern performance counters, however, is limited by inefficient techniques used today to access them. Current access techniques rely on imprecise sampling or heavyweight kernel interaction forcing users to choose between precision or speed and thus restricting the use of performance counter hardware. In this paper, we describe new methods that enable precise, lightweight interfacing to on-chip performance counters. These low-overhead techniques allow precise reading of virtualized counters in low tens of nanoseconds, which is one to two orders of magnitude faster than current access techniques. Further, these tools provide several fresh insights on the behavior of modern parallel programs such as MySQL and Firefox, which were previously obscured (or impossible to obtain) by existing methods for characterization. Based on case studies with our new access methods, we discuss seven implications for computer architects in the cloud era and three methods for enhancing hardware counters further. Taken together, these observations have the potential to open up new avenues for architecture research.
通过精确的事件计数快速识别架构瓶颈
片上性能计数器在计算机体系结构研究中起着至关重要的作用,因为它们能够快速提供对应用程序行为的洞察,而传统方法需要花费大量时间来表征这些行为。然而,现代性能计数器的有用性受到目前使用的低效技术的限制。当前的访问技术依赖于不精确的采样或重量级内核交互,迫使用户在精度或速度之间做出选择,从而限制了性能计数器硬件的使用。在本文中,我们描述了新的方法,使精确,轻量级接口芯片上的性能计数器。这些低开销技术允许在几十纳秒内精确读取虚拟计数器,这比当前的访问技术快一到两个数量级。此外,这些工具提供了一些关于现代并行程序(如MySQL和Firefox)行为的新见解,这些见解以前是通过现有的表征方法模糊(或不可能获得)的。基于我们的新访问方法的案例研究,我们讨论了云时代计算机架构师的七个含义以及进一步增强硬件计数器的三种方法。总的来说,这些观察结果有可能为建筑研究开辟新的途径。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信