Diogo Marques, Helder Duarte, A. Ilic, L. Sousa, Roman Belenov, P. Thierry, Zakhar A. Matveev
Published in: 2017 International Conference on High Performance Computing & Simulation (HPCS), 2017-07-01. DOI: 10.1109/HPCS.2017.150 (https://doi.org/10.1109/HPCS.2017.150)
Performance Analysis with Cache-Aware Roofline Model in Intel Advisor
The growing complexity of processor architectures poses significant challenges when designing and optimizing the execution of real-world applications, even on general-purpose hardware. Tools that provide fast and insightful visualization of architecture and application execution bottlenecks, such as the recently proposed Cache-aware Roofline Model (CARM), are therefore particularly useful for computer architects and application engineers. CARM is an insightful architecture performance model that provides a simple and intuitive way of visually representing the limits of parallel processing on contemporary multi-core processors with complex memory hierarchies. In recent updates, Intel Advisor integrated CARM into its workflow. Intel Advisor is a powerful tool that helps application developers extract the full performance potential of a processor architecture by analyzing applications and providing hints on parallelization, vectorization, and memory access improvements. Coupled with CARM, Intel Advisor Roofline therefore represents a complete analysis and visualization framework for application characterization, optimization, and development. This paper introduces the CARM analysis methodology within Intel Advisor and also showcases the usability of other Advisor features. For this purpose, a set of 10 applications from different benchmark suites was analyzed on a state-of-the-art hardware platform in order to uncover the most critical bottlenecks and possible optimization steps to overcome them. By following the optimization guidelines given by Intel Advisor Roofline, the performance of several application kernels was improved by up to 6.43 times compared to the unoptimized versions.
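As a rough illustration of the roofline idea the abstract refers to, the sketch below computes the generic roofline bound, min(peak compute, arithmetic intensity × bandwidth), once per memory level. In CARM the intensity counts bytes as seen by the core, so the same intensity value is compared against every level's roof. This is not Intel Advisor's implementation; the function name and all machine parameters are hypothetical and chosen only for illustration.

```python
def carm_attainable_gflops(intensity, peak_gflops, bandwidths_gbs):
    """Return the roofline bound (GFLOP/s) for each memory level.

    intensity      -- arithmetic intensity in FLOP per byte
    peak_gflops    -- peak compute throughput in GFLOP/s
    bandwidths_gbs -- dict mapping memory level name to bandwidth in GB/s
    """
    return {
        level: min(peak_gflops, intensity * bw)
        for level, bw in bandwidths_gbs.items()
    }


# Hypothetical machine parameters (assumptions, not measured values).
PEAK_GFLOPS = 100.0
BANDWIDTHS_GBS = {"L1": 400.0, "L2": 200.0, "L3": 100.0, "DRAM": 20.0}

# A hypothetical kernel performing 0.25 FLOP per byte moved.
bounds = carm_attainable_gflops(0.25, PEAK_GFLOPS, BANDWIDTHS_GBS)
for level, bound in bounds.items():
    print(f"{level}: {bound:.1f} GFLOP/s")
```

With these assumed numbers, a kernel whose data is served from DRAM is capped far below the compute peak, while the same kernel served from L1 can reach it; this is the kind of gap a roofline plot makes visible.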