Sequential Task Flow Runtime Model Improvements and Limitations

2022 IEEE/ACM International Workshop on Runtime and Operating Systems for Supercomputers (ROSS). Pub Date: 2022-11-01. DOI: 10.1109/ROSS56639.2022.00009

Yu Pei, G. Bosilca, J. Dongarra



Abstract

The sequential task flow (STF) model is the mainstream approach for interacting with task-based runtime systems, with StarPU and the Dynamic Task Discovery (DTD) interface in PaRSEC being two implementations of this model. Compared with other approaches to submitting tasks into a runtime system, STF has interesting advantages centered around an easy-to-use API that allows users to express algorithms as a sequence of tasks (much like in OpenMP), while allowing the runtime to automatically identify and analyze task dependencies and handle scheduling. In this paper, we focus on the DTD interface in PaRSEC, highlight some of its lesser-known limitations, and implement two optimization techniques for DTD: support for user-level graph trimming, and a new API for broadcasting read-only data to remote tasks. We then analyze the benefits and limitations of these optimizations with benchmarks as well as on two common matrix factorization kernels, Cholesky and QR, on two different systems: Shaheen II from KAUST and Fugaku from RIKEN. We point out some potential for further improvements and provide valuable insights into the strengths and weaknesses of the STF model, hoping to guide the future development of task-based runtime systems.
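To make the STF idea in the abstract concrete, the following is a minimal toy sketch of how a runtime can infer dependencies from a sequential stream of task insertions. It is not the PaRSEC DTD or StarPU API: the `ToyRuntime` class, the `insert_task` method, the `READ`/`WRITE` flags, and the tile names are all hypothetical names chosen for illustration, and the sketch only covers dependency discovery, not scheduling, graph trimming, or broadcasting.

```python
from dataclasses import dataclass, field

# Hypothetical access-mode flags (WRITE here also stands for read-write).
READ, WRITE = "R", "W"


@dataclass
class ToyRuntime:
    """Toy STF front end: infers task dependencies from declared data accesses."""
    names: list = field(default_factory=list)        # task id -> task name
    deps: dict = field(default_factory=dict)         # task id -> set of predecessor ids
    last_writer: dict = field(default_factory=dict)  # data -> id of last writing task
    readers: dict = field(default_factory=dict)      # data -> ids reading since last write

    def insert_task(self, name, *accesses):
        """Register a task in program order; accesses are (data, mode) pairs."""
        tid = len(self.names)
        self.names.append(name)
        preds = set()
        for data, mode in accesses:
            # Any access must wait for the previous writer of the same data.
            if data in self.last_writer:
                preds.add(self.last_writer[data])
            if mode == WRITE:
                # A writer must also wait for readers of the previous version.
                preds.update(self.readers.get(data, []))
                self.last_writer[data] = tid
                self.readers[data] = []
            else:
                self.readers.setdefault(data, []).append(tid)
        preds.discard(tid)  # a task never depends on itself
        self.deps[tid] = preds
        return tid


# Sequential submission of a 2x2 tile Cholesky-like task stream (illustrative only).
rt = ToyRuntime()
rt.insert_task("POTRF(0)", ("A00", WRITE))
rt.insert_task("TRSM(1,0)", ("A00", READ), ("A10", WRITE))
rt.insert_task("SYRK(1,0)", ("A10", READ), ("A11", WRITE))
rt.insert_task("POTRF(1)", ("A11", WRITE))

for tid, preds in rt.deps.items():
    print(rt.names[tid], "<-", sorted(rt.names[p] for p in preds))
```

The user code only lists tasks in program order with their data accesses; the dependency graph falls out of the bookkeeping in `insert_task`. In a distributed setting, every process typically has to walk this whole task sequence, which is the kind of overhead that techniques such as graph trimming aim to reduce.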

