Performance Analysis with Cache-Aware Roofline Model in Intel Advisor

2017 International Conference on High Performance Computing & Simulation (HPCS), Pub Date: 2017-07-01, DOI: 10.1109/HPCS.2017.150

Diogo Marques, Helder Duarte, A. Ilic, L. Sousa, Roman Belenov, P. Thierry, Zakhar A. Matveev


Citations: 24

Abstract

The recent increase in the complexity of processor architectures imposes significant challenges when designing and optimizing the execution of real-world applications, even on general-purpose hardware. To help in this process, tools for fast and insightful visualization of architecture and application execution bottlenecks, such as the recently proposed Cache-aware Roofline Model (CARM), are particularly useful for computer architects and application engineers. CARM is an insightful architecture performance model that provides a simple and intuitive way of visually representing the limits of parallel processing on contemporary multi-core processors with a complex memory hierarchy. In its recent updates, Intel Advisor integrated CARM into its workflow. Intel Advisor is a powerful tool that helps application developers extract the full potential performance of a processor architecture by analyzing applications and providing hints on parallelization, vectorization, and memory access improvements. Therefore, when coupled with CARM, Intel Advisor Roofline represents a complete analysis and visualization framework for application characterization, optimization, and development. This paper focuses on introducing the CARM analysis methodology within Intel Advisor, while also showcasing the usability of other Advisor features. For this purpose, a set of 10 applications from different benchmark suites was analyzed on a state-of-the-art hardware platform in order to uncover the most critical bottlenecks and the possible optimization steps to overcome them. By following the optimization guidelines given by Intel Advisor Roofline, the performance of several application kernels was improved by up to 6.43 times compared to the unoptimized versions.
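The abstract does not state the model's formula, so the following is only a minimal sketch of the general roofline bound that CARM builds on: attainable performance is the smaller of the peak compute rate and the product of memory bandwidth and arithmetic intensity, with one roof per memory level as seen from the core. The machine numbers (`PEAK_GFLOPS`, `BANDWIDTH_GBS`) and the function names are hypothetical placeholders chosen for illustration; they are not measurements from the paper and not part of any Intel Advisor API.

```python
# Sketch of a roofline bound with one roof per memory level (CARM-style).
# All hardware numbers below are hypothetical, not taken from the paper.

PEAK_GFLOPS = 100.0  # assumed peak floating-point throughput (GFLOP/s)
BANDWIDTH_GBS = {    # assumed core-to-memory-level bandwidths (GB/s)
    "L1": 400.0,
    "L2": 200.0,
    "L3": 100.0,
    "DRAM": 25.0,
}


def attainable_gflops(intensity, peak_gflops, bandwidth_gbs):
    """Return min(peak, bandwidth * intensity) for one memory level.

    intensity is the arithmetic intensity in FLOPs per byte.
    """
    if intensity <= 0:
        raise ValueError("arithmetic intensity must be positive")
    return min(peak_gflops, bandwidth_gbs * intensity)


def carm_roofs(intensity):
    """Return the attainable performance under each memory-level roof."""
    return {
        level: attainable_gflops(intensity, PEAK_GFLOPS, bandwidth)
        for level, bandwidth in BANDWIDTH_GBS.items()
    }


def ridge_point(peak_gflops, bandwidth_gbs):
    """Return the intensity at which a roof turns from memory- to compute-bound."""
    return peak_gflops / bandwidth_gbs


if __name__ == "__main__":
    for level, gflops in carm_roofs(0.5).items():
        print(f"{level}: {gflops:.1f} GFLOP/s")
    print(f"DRAM ridge point: {ridge_point(PEAK_GFLOPS, BANDWIDTH_GBS['DRAM'])} FLOPs/byte")
```

Under these assumed numbers, a kernel whose measured performance sits well below the roof for the memory level that serves most of its data still has headroom, which is the kind of reading a roofline chart is meant to support.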
