Accelerated execution via eager-release of dependencies in task-based workflows

IF 3.5 3区 计算机科学 Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE
Hatem Elshazly, F. Lordan, J. Ejarque, R. Badia
{"title":"Accelerated execution via eager-release of dependencies in task-based workflows","authors":"Hatem Elshazly, F. Lordan, J. Ejarque, R. Badia","doi":"10.1177/1094342021997558","DOIUrl":null,"url":null,"abstract":"Task-based programming models offer a flexible way to express the unstructured parallelism patterns of nowadays complex applications. This expressive capability is required to achieve maximum possible performance for applications that are executed in distributed execution platforms. In current task-based workflows, tasks are launched for execution when their data dependencies are satisfied. However, even though the data dependencies of a certain task might have been already produced, the execution of this task will be delayed until its predecessor tasks completely finish their execution. As a consequence of this approach of releasing dependencies, the amount of parallelism inherent in applications is limited and performance improvement opportunities are wasted. To mitigate this limitation, we propose an eager approach for releasing data dependencies. Following this approach, the execution of tasks will not be delayed until their predecessor tasks completely finish their execution, instead, tasks will be launched for execution as soon as their data requirements are available. Hence, more parallelism is exposed and applications can achieve higher levels of performance by overlapping the execution of tasks. Towards achieving this goal, in this paper we propose applying two changes to task-based workflow systems. First, modifying the dependency relationships of tasks to be specified not only in terms of predecessor and successor tasks but also in terms of the data that caused these dependencies. Second, triggering the release of dependencies as soon as a predecessor task generates the output data instead of having to wait until the end of the predecessor execution to release all of its dependencies. We realize this proposal using PyCOMPSs: a task-based programming model for parallelizing Python applications. Our experiments show that using an eager approach for releasing dependencies achieves more than 50% performance improvement in the total execution time as compared to the default approach of releasing dependencies.","PeriodicalId":54957,"journal":{"name":"International Journal of High Performance Computing Applications","volume":"35 1","pages":"325 - 343"},"PeriodicalIF":3.5000,"publicationDate":"2021-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1177/1094342021997558","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of High Performance Computing Applications","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1177/1094342021997558","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
引用次数: 2

Abstract

Task-based programming models offer a flexible way to express the unstructured parallelism patterns of nowadays complex applications. This expressive capability is required to achieve maximum possible performance for applications that are executed in distributed execution platforms. In current task-based workflows, tasks are launched for execution when their data dependencies are satisfied. However, even though the data dependencies of a certain task might have been already produced, the execution of this task will be delayed until its predecessor tasks completely finish their execution. As a consequence of this approach of releasing dependencies, the amount of parallelism inherent in applications is limited and performance improvement opportunities are wasted. To mitigate this limitation, we propose an eager approach for releasing data dependencies. Following this approach, the execution of tasks will not be delayed until their predecessor tasks completely finish their execution, instead, tasks will be launched for execution as soon as their data requirements are available. Hence, more parallelism is exposed and applications can achieve higher levels of performance by overlapping the execution of tasks. Towards achieving this goal, in this paper we propose applying two changes to task-based workflow systems. First, modifying the dependency relationships of tasks to be specified not only in terms of predecessor and successor tasks but also in terms of the data that caused these dependencies. Second, triggering the release of dependencies as soon as a predecessor task generates the output data instead of having to wait until the end of the predecessor execution to release all of its dependencies. We realize this proposal using PyCOMPSs: a task-based programming model for parallelizing Python applications. Our experiments show that using an eager approach for releasing dependencies achieves more than 50% performance improvement in the total execution time as compared to the default approach of releasing dependencies.
通过在基于任务的工作流中快速释放依赖来加速执行
基于任务的编程模型提供了一种灵活的方式来表达当今复杂应用程序的非结构化并行模式。对于在分布式执行平台中执行的应用程序,需要这种表达能力来实现最大可能的性能。在当前基于任务的工作流中,当任务的数据依赖性得到满足时,任务就会启动并执行。然而,即使某个任务的数据依赖关系可能已经产生,这个任务的执行将被延迟,直到它的前一个任务完全完成它们的执行。这种释放依赖关系的方法的结果是,应用程序中固有的并行性受到限制,并且浪费了性能改进的机会。为了减轻这种限制,我们提出了一种急于释放数据依赖的方法。通过这种方法,任务的执行不会延迟到其前任任务完全完成执行,而是在数据需求可用时立即启动执行任务。因此,暴露了更多的并行性,应用程序可以通过重叠任务的执行来实现更高级别的性能。为了实现这一目标,在本文中,我们建议对基于任务的工作流系统进行两个更改。首先,修改要指定的任务的依赖关系,不仅要根据前导和后继任务,还要根据导致这些依赖关系的数据。其次,在前驱任务生成输出数据时立即触发依赖项的释放,而不必等到前驱任务执行结束时才释放其所有依赖项。我们使用PyCOMPSs来实现这个提议:PyCOMPSs是一个基于任务的编程模型,用于并行化Python应用程序。我们的实验表明,与默认的释放依赖的方法相比,使用急于释放依赖的方法在总执行时间内实现了50%以上的性能提升。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
International Journal of High Performance Computing Applications
International Journal of High Performance Computing Applications 工程技术-计算机:跨学科应用
CiteScore
6.10
自引率
6.50%
发文量
32
审稿时长
>12 weeks
期刊介绍: With ever increasing pressure for health services in all countries to meet rising demands, improve their quality and efficiency, and to be more accountable; the need for rigorous research and policy analysis has never been greater. The Journal of Health Services Research & Policy presents the latest scientific research, insightful overviews and reflections on underlying issues, and innovative, thought provoking contributions from leading academics and policy-makers. It provides ideas and hope for solving dilemmas that confront all countries.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信