{"title":"Performance Analysis and Optimization with Little’s Law","authors":"Sanyam Mehta","doi":"10.1109/ispass55109.2022.00002","DOIUrl":null,"url":null,"abstract":"Performance tools are the bridge between processor architecture and a user. However, with the increasingly complex processor architectures, it is becoming increasingly difficult for the users to comprehend the information generated by the performance tools to help diagnose and fix the performance bottlenecks. In addition, the performance tools are themselves limited in many cases. Finally, there is wide variability in the kind of performance counters provided by the different processor vendors, making performance tools unportable across emerging architectures. In this work, we propose to solve these problems by accurately computing a portable and easily comprehensible performance metric - the (Memory-Level Parallelism) MLP of an application. The observed MLP when seen as a fraction of peak theoretical MLP supported by the host processor provides important guidance on the applicability of various popular program optimizations. Six case studies on three different processors each with a different memory technology show that our metric is both effective in program analysis and provides useful guidance on program optimization.","PeriodicalId":115391,"journal":{"name":"2022 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ispass55109.2022.00002","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3
Abstract
Performance tools are the bridge between processor architecture and a user. However, with the increasingly complex processor architectures, it is becoming increasingly difficult for the users to comprehend the information generated by the performance tools to help diagnose and fix the performance bottlenecks. In addition, the performance tools are themselves limited in many cases. Finally, there is wide variability in the kind of performance counters provided by the different processor vendors, making performance tools unportable across emerging architectures. In this work, we propose to solve these problems by accurately computing a portable and easily comprehensible performance metric - the (Memory-Level Parallelism) MLP of an application. The observed MLP when seen as a fraction of peak theoretical MLP supported by the host processor provides important guidance on the applicability of various popular program optimizations. Six case studies on three different processors each with a different memory technology show that our metric is both effective in program analysis and provides useful guidance on program optimization.