使用 Scalasca/Score-P 和 Paraver/Extrae 工具集进行 15 年以上的联合并行应用性能分析/工具培训

IF 6.2 2区计算机科学 Q1 COMPUTER SCIENCE, THEORY & METHODS

Future Generation Computer Systems-The International Journal of Escience Pub Date : 2024-08-02 DOI:10.1016/j.future.2024.07.050

{"title":"使用 Scalasca/Score-P 和 Paraver/Extrae 工具集进行 15 年以上的联合并行应用性能分析/工具培训","authors":"","doi":"10.1016/j.future.2024.07.050","DOIUrl":null,"url":null,"abstract":"<div><p>The diverse landscape of distributed heterogeneous computer systems currently available and being created to address computational challenges with the highest performance requirements presents daunting complexity for application developers. They must effectively decompose and distribute their application functionality and data, efficiently orchestrating the associated communication and synchronisation, on multi/manycore CPU processors with multiple attached acceleration devices structured within compute nodes with interconnection networks of various topologies.</p><p>Sophisticated compilers, runtime systems and libraries are (loosely) matched with debugging, performance measurement and analysis tools, with proprietary versions by integrators/vendors provided exclusively for their systems complemented by portable (primarily) open-source equivalents developed and supported by the international research community over many years. The <em>Scalasca</em> and <em>Paraver</em> toolsets are two widely employed examples of the latter, installed on personal notebook computers through to the largest leadership HPC systems. Over more than fifteen years their developers have worked closely together in numerous collaborative projects culminating in the creation of a universal parallel performance assessment and optimisation methodology focused on application execution efficiency and scalability, and the associated training and coaching of application developers (often in teams) in its productive use, reviewed in this article with lessons learnt therefrom.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2000,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0167739X24004187/pdfft?md5=6d95fa0157a348afe6a2f74eeb9b1f7c&pid=1-s2.0-S0167739X24004187-main.pdf","citationCount":"0","resultStr":"{\"title\":\"15+ years of joint parallel application performance analysis/tools training with Scalasca/Score-P and Paraver/Extrae toolsets\",\"authors\":\"\",\"doi\":\"10.1016/j.future.2024.07.050\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>The diverse landscape of distributed heterogeneous computer systems currently available and being created to address computational challenges with the highest performance requirements presents daunting complexity for application developers. They must effectively decompose and distribute their application functionality and data, efficiently orchestrating the associated communication and synchronisation, on multi/manycore CPU processors with multiple attached acceleration devices structured within compute nodes with interconnection networks of various topologies.</p><p>Sophisticated compilers, runtime systems and libraries are (loosely) matched with debugging, performance measurement and analysis tools, with proprietary versions by integrators/vendors provided exclusively for their systems complemented by portable (primarily) open-source equivalents developed and supported by the international research community over many years. The <em>Scalasca</em> and <em>Paraver</em> toolsets are two widely employed examples of the latter, installed on personal notebook computers through to the largest leadership HPC systems. Over more than fifteen years their developers have worked closely together in numerous collaborative projects culminating in the creation of a universal parallel performance assessment and optimisation methodology focused on application execution efficiency and scalability, and the associated training and coaching of application developers (often in teams) in its productive use, reviewed in this article with lessons learnt therefrom.</p></div>\",\"PeriodicalId\":55132,\"journal\":{\"name\":\"Future Generation Computer Systems-The International Journal of Escience\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":6.2000,\"publicationDate\":\"2024-08-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.sciencedirect.com/science/article/pii/S0167739X24004187/pdfft?md5=6d95fa0157a348afe6a2f74eeb9b1f7c&pid=1-s2.0-S0167739X24004187-main.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Future Generation Computer Systems-The International Journal of Escience\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0167739X24004187\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, THEORY & METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Future Generation Computer Systems-The International Journal of Escience","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0167739X24004187","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}

引用次数: 0

摘要

为应对最高性能要求的计算挑战，目前可用和正在创建的分布式异构计算机系统种类繁多，这给应用开发人员带来了令人生畏的复杂性。他们必须在多核 CPU 处理器上有效地分解和分配其应用功能和数据，并有效地协调相关的通信和同步，同时在具有不同拓扑结构互连网络的计算节点内结构多个附加加速设备。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

15+ years of joint parallel application performance analysis/tools training with Scalasca/Score-P and Paraver/Extrae toolsets

The diverse landscape of distributed heterogeneous computer systems currently available and being created to address computational challenges with the highest performance requirements presents daunting complexity for application developers. They must effectively decompose and distribute their application functionality and data, efficiently orchestrating the associated communication and synchronisation, on multi/manycore CPU processors with multiple attached acceleration devices structured within compute nodes with interconnection networks of various topologies.

Sophisticated compilers, runtime systems and libraries are (loosely) matched with debugging, performance measurement and analysis tools, with proprietary versions by integrators/vendors provided exclusively for their systems complemented by portable (primarily) open-source equivalents developed and supported by the international research community over many years. The Scalasca and Paraver toolsets are two widely employed examples of the latter, installed on personal notebook computers through to the largest leadership HPC systems. Over more than fifteen years their developers have worked closely together in numerous collaborative projects culminating in the creation of a universal parallel performance assessment and optimisation methodology focused on application execution efficiency and scalability, and the associated training and coaching of application developers (often in teams) in its productive use, reviewed in this article with lessons learnt therefrom.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Future Generation Computer Systems-The International Journal of Escience 工程技术-计算机：理论方法

CiteScore

19.90

自引率

2.70%

发文量

376

审稿时长

10.6 months

期刊介绍： Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications. Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration. Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.