{"title":"Self-monitoring overhead of the Linux perf_ event performance counter interface","authors":"Vincent M. Weaver","doi":"10.1109/ISPASS.2015.7095789","DOIUrl":null,"url":null,"abstract":"Most modern CPUs include hardware performance counters: architectural registers that allow programmers to gain low-level insight into system performance. Low-overhead access to these counters is necessary for accurate performance analysis, making the operating system interface critical to providing low-latency performance data. We investigate the overhead of self-monitoring performance counter measurements on the Linux perf_event interface. We find that default code (such as that used by PAPI) implementing the perf_event self-monitoring interface can have large overhead: up to an order of magnitude larger than the previously used perfctr and perfmon2 performance counter implementations. We investigate the causes of this overhead and find that with proper coding this overhead can be greatly reduced on recent Linux kernels.","PeriodicalId":189378,"journal":{"name":"2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)","volume":"78 2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-03-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"39","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISPASS.2015.7095789","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 39
Abstract
Most modern CPUs include hardware performance counters: architectural registers that allow programmers to gain low-level insight into system performance. Low-overhead access to these counters is necessary for accurate performance analysis, making the operating system interface critical to providing low-latency performance data. We investigate the overhead of self-monitoring performance counter measurements on the Linux perf_event interface. We find that default code (such as that used by PAPI) implementing the perf_event self-monitoring interface can have large overhead: up to an order of magnitude larger than the previously used perfctr and perfmon2 performance counter implementations. We investigate the causes of this overhead and find that with proper coding this overhead can be greatly reduced on recent Linux kernels.