Jennifer M. Anderson, Lance M. Berc, J. Dean, S. Ghemawat, Monika Henzinger, Shun-Tak Leung, R. L. Sites, Mark T. Vandevoorde, Carl A. Waldspurger, W. Weihl
Title: Continuous profiling: where have all the cycles gone?
Published in: Proceedings of the sixteenth ACM symposium on Operating systems principles, 1997-10-01
DOI: 10.1145/268998.266637 (https://doi.org/10.1145/268998.266637)
Citations: 36
Abstract
This paper describes the DIGITAL Continuous Profiling Infrastructure, a sampling-based profiling system designed to run continuously on production systems. The system supports multiprocessors, works on unmodified executables, and collects profiles for entire systems, including user programs, shared libraries, and the operating system kernel. Samples are collected at a high rate (over 5200 samples/sec per 333-MHz processor), yet with low overhead (1-3% slowdown for most workloads). Analysis tools supplied with the profiling system use the sample data to produce an accurate accounting, down to the level of pipeline stalls incurred by individual instructions, of where time is being spent. When instructions incur stalls, the tools identify possible reasons, such as cache misses, branch mispredictions, and functional unit contention. The fine-grained instruction-level analysis guides users and automated optimizers to the causes of performance problems and provides important insights for fixing them.
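To put the quoted figures in perspective, the following is a minimal back-of-the-envelope sketch in Python, not anything taken from the profiling system itself. It uses only the two numbers stated in the abstract (a 333-MHz clock and a rate of 5200 samples/sec, which the abstract gives as a lower bound) to estimate the average number of cycles between samples, roughly 64,000. It then makes one loud assumption: that the whole 1-3% slowdown is spent handling samples. Under that assumption it derives a rough per-sample cycle budget. The function names are hypothetical and chosen for illustration.

```python
# Figures taken from the abstract.
CLOCK_HZ = 333_000_000   # 333-MHz processor
SAMPLES_PER_SEC = 5200   # stated as "over 5200", so treat as a lower bound


def cycles_between_samples(clock_hz: int, samples_per_sec: int) -> float:
    """Average number of clock cycles that elapse between two samples."""
    return clock_hz / samples_per_sec


def cycle_budget_per_sample(interval_cycles: float, slowdown: float) -> float:
    """Cycles available per sample if ALL slowdown is sample handling.

    This attribution is an assumption made for illustration; the abstract
    does not break the overhead down.
    """
    return interval_cycles * slowdown


interval = cycles_between_samples(CLOCK_HZ, SAMPLES_PER_SEC)
budget_low = cycle_budget_per_sample(interval, 0.01)   # 1% slowdown
budget_high = cycle_budget_per_sample(interval, 0.03)  # 3% slowdown

print(f"average cycles between samples: {interval:.0f}")
print(f"per-sample budget at 1-3% slowdown: {budget_low:.0f} to {budget_high:.0f} cycles")
```

Because the stated rate is a lower bound, the computed interval is an upper bound on the average spacing between samples; a higher actual rate shortens the interval and shrinks the per-sample budget accordingly.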