基于任务图的并行实时性能分析方法

IF 2.1 4区计算机科学 Q2 COMPUTER SCIENCE, THEORY & METHODS

Parallel Computing Pub Date : 2023-09-14 DOI:10.1016/j.parco.2023.103050

Matthias Bolten, Stephanie Friedhoff, Jens Hahne

{"title":"基于任务图的并行实时性能分析方法","authors":"Matthias Bolten, Stephanie Friedhoff, Jens Hahne","doi":"10.1016/j.parco.2023.103050","DOIUrl":null,"url":null,"abstract":"<div><p>In this paper, we present a performance model based on task graphs for various iterative parallel-in-time (PinT) methods. PinT methods have been developed to speed up the simulation time of time-dependent problems using modern parallel supercomputers<span>. The performance model is based on a data-driven notation of the methods, from which a task graph is generated. Based on this task graph and a distribution of time points across processes typical for PinT methods, a theoretical lower runtime bound for the method can be obtained, as well as a prediction of the runtime for a given number of processes. In particular, the model is able to cover the large parameter space of PinT methods and make predictions for arbitrary parameter settings. Here, we describe a general procedure for generating task graphs based on three iterative PinT methods, namely, Parareal, multigrid-reduction-in-time (MGRIT), and the parallel full approximation scheme in space and time (PFASST). Furthermore, we discuss how these task graphs can be used to analyze the performance of the methods. In addition, we compare the predictions of the model with parallel simulation times using five different PinT libraries.</span></p></div>","PeriodicalId":54642,"journal":{"name":"Parallel Computing","volume":"118 ","pages":"Article 103050"},"PeriodicalIF":2.1000,"publicationDate":"2023-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Task graph-based performance analysis of parallel-in-time methods\",\"authors\":\"Matthias Bolten, Stephanie Friedhoff, Jens Hahne\",\"doi\":\"10.1016/j.parco.2023.103050\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>In this paper, we present a performance model based on task graphs for various iterative parallel-in-time (PinT) methods. PinT methods have been developed to speed up the simulation time of time-dependent problems using modern parallel supercomputers<span>. The performance model is based on a data-driven notation of the methods, from which a task graph is generated. Based on this task graph and a distribution of time points across processes typical for PinT methods, a theoretical lower runtime bound for the method can be obtained, as well as a prediction of the runtime for a given number of processes. In particular, the model is able to cover the large parameter space of PinT methods and make predictions for arbitrary parameter settings. Here, we describe a general procedure for generating task graphs based on three iterative PinT methods, namely, Parareal, multigrid-reduction-in-time (MGRIT), and the parallel full approximation scheme in space and time (PFASST). Furthermore, we discuss how these task graphs can be used to analyze the performance of the methods. In addition, we compare the predictions of the model with parallel simulation times using five different PinT libraries.</span></p></div>\",\"PeriodicalId\":54642,\"journal\":{\"name\":\"Parallel Computing\",\"volume\":\"118 \",\"pages\":\"Article 103050\"},\"PeriodicalIF\":2.1000,\"publicationDate\":\"2023-09-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Parallel Computing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S016781912300056X\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, THEORY & METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Parallel Computing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S016781912300056X","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}

引用次数: 0

摘要

在本文中，我们提出了一个基于任务图的各种迭代并行实时(PinT)方法的性能模型。在现代并行超级计算机上，为了加快时间相关问题的模拟速度，发展了PinT方法。性能模型基于方法的数据驱动表示法，从中生成任务图。基于此任务图和典型的PinT方法的跨进程时间点分布，可以获得该方法的理论运行时下限，以及对给定数量进程的运行时的预测。特别是，该模型能够覆盖PinT方法的大参数空间，并对任意参数设置进行预测。在这里，我们描述了一种基于三种迭代的PinT方法生成任务图的一般过程，即Parareal, multi - grid-reduction-in-time (MGRIT)和parallel full approximation in space and time (PFASST)。此外，我们还讨论了如何使用这些任务图来分析方法的性能。此外，我们使用五个不同的PinT库将模型的预测与并行模拟时间进行比较。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Task graph-based performance analysis of parallel-in-time methods

In this paper, we present a performance model based on task graphs for various iterative parallel-in-time (PinT) methods. PinT methods have been developed to speed up the simulation time of time-dependent problems using modern parallel supercomputers. The performance model is based on a data-driven notation of the methods, from which a task graph is generated. Based on this task graph and a distribution of time points across processes typical for PinT methods, a theoretical lower runtime bound for the method can be obtained, as well as a prediction of the runtime for a given number of processes. In particular, the model is able to cover the large parameter space of PinT methods and make predictions for arbitrary parameter settings. Here, we describe a general procedure for generating task graphs based on three iterative PinT methods, namely, Parareal, multigrid-reduction-in-time (MGRIT), and the parallel full approximation scheme in space and time (PFASST). Furthermore, we discuss how these task graphs can be used to analyze the performance of the methods. In addition, we compare the predictions of the model with parallel simulation times using five different PinT libraries.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Parallel Computing 工程技术-计算机：理论方法

CiteScore

3.50

自引率

7.10%

发文量

审稿时长

4.5 months

期刊介绍： Parallel Computing is an international journal presenting the practical use of parallel computer systems, including high performance architecture, system software, programming systems and tools, and applications. Within this context the journal covers all aspects of high-end parallel computing from single homogeneous or heterogenous computing nodes to large-scale multi-node systems. Parallel Computing features original research work and review articles as well as novel or illustrative accounts of application experience with (and techniques for) the use of parallel computers. We also welcome studies reproducing prior publications that either confirm or disprove prior published results. Particular technical areas of interest include, but are not limited to: -System software for parallel computer systems including programming languages (new languages as well as compilation techniques), operating systems (including middleware), and resource management (scheduling and load-balancing). -Enabling software including debuggers, performance tools, and system and numeric libraries. -General hardware (architecture) concepts, new technologies enabling the realization of such new concepts, and details of commercially available systems -Software engineering and productivity as it relates to parallel computing -Applications (including scientific computing, deep learning, machine learning) or tool case studies demonstrating novel ways to achieve parallelism -Performance measurement results on state-of-the-art systems -Approaches to effectively utilize large-scale parallel computing including new algorithms or algorithm analysis with demonstrated relevance to real applications using existing or next generation parallel computer architectures. -Parallel I/O systems both hardware and software -Networking technology for support of high-speed computing demonstrating the impact of high-speed computation on parallel applications