有限内存条件下任务树的并行调度

IF 0.9 Q3 COMPUTER SCIENCE, THEORY & METHODS
Lionel Eyraud-Dubois, L. Marchal, O. Sinnen, F. Vivien
{"title":"有限内存条件下任务树的并行调度","authors":"Lionel Eyraud-Dubois, L. Marchal, O. Sinnen, F. Vivien","doi":"10.1145/2779052","DOIUrl":null,"url":null,"abstract":"This article investigates the execution of tree-shaped task graphs using multiple processors. Each edge of such a tree represents some large data. A task can only be executed if all input and output data fit into memory, and a data can only be removed from memory after the completion of the task that uses it as an input data. Such trees arise in the multifrontal method of sparse matrix factorization. The peak memory needed for the processing of the entire tree depends on the execution order of the tasks. With one processor, the objective of the tree traversal is to minimize the required memory. This problem was well studied, and optimal polynomial algorithms were proposed.\n Here, we extend the problem by considering multiple processors, which is of obvious interest in the application area of matrix factorization. With multiple processors comes the additional objective to minimize the time needed to traverse the tree—that is, to minimize the makespan. Not surprisingly, this problem proves to be much harder than the sequential one. We study the computational complexity of this problem and provide inapproximability results even for unit weight trees. We design a series of practical heuristics achieving different trade-offs between the minimization of peak memory usage and makespan. Some of these heuristics are able to process a tree while keeping the memory usage under a given memory limit. The different heuristics are evaluated in an extensive experimental evaluation using realistic trees.","PeriodicalId":42115,"journal":{"name":"ACM Transactions on Parallel Computing","volume":"25 1","pages":"13:1-13:37"},"PeriodicalIF":0.9000,"publicationDate":"2014-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"31","resultStr":"{\"title\":\"Parallel Scheduling of Task Trees with Limited Memory\",\"authors\":\"Lionel Eyraud-Dubois, L. Marchal, O. Sinnen, F. Vivien\",\"doi\":\"10.1145/2779052\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This article investigates the execution of tree-shaped task graphs using multiple processors. Each edge of such a tree represents some large data. A task can only be executed if all input and output data fit into memory, and a data can only be removed from memory after the completion of the task that uses it as an input data. Such trees arise in the multifrontal method of sparse matrix factorization. The peak memory needed for the processing of the entire tree depends on the execution order of the tasks. With one processor, the objective of the tree traversal is to minimize the required memory. This problem was well studied, and optimal polynomial algorithms were proposed.\\n Here, we extend the problem by considering multiple processors, which is of obvious interest in the application area of matrix factorization. With multiple processors comes the additional objective to minimize the time needed to traverse the tree—that is, to minimize the makespan. Not surprisingly, this problem proves to be much harder than the sequential one. We study the computational complexity of this problem and provide inapproximability results even for unit weight trees. We design a series of practical heuristics achieving different trade-offs between the minimization of peak memory usage and makespan. Some of these heuristics are able to process a tree while keeping the memory usage under a given memory limit. The different heuristics are evaluated in an extensive experimental evaluation using realistic trees.\",\"PeriodicalId\":42115,\"journal\":{\"name\":\"ACM Transactions on Parallel Computing\",\"volume\":\"25 1\",\"pages\":\"13:1-13:37\"},\"PeriodicalIF\":0.9000,\"publicationDate\":\"2014-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"31\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Transactions on Parallel Computing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2779052\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, THEORY & METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Parallel Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2779052","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
引用次数: 31

摘要

本文研究使用多个处理器执行树形任务图。这种树的每条边代表一些大数据。只有当所有输入和输出数据都适合内存时,任务才能执行,并且数据只有在将其用作输入数据的任务完成后才能从内存中删除。这种树出现在稀疏矩阵分解的多正面方法中。处理整个树所需的峰值内存取决于任务的执行顺序。对于一个处理器,树遍历的目标是最小化所需的内存。对该问题进行了深入的研究,并提出了最优多项式算法。在这里,我们通过考虑多处理器来扩展问题,这在矩阵分解的应用领域有明显的兴趣。多处理器带来了另一个目标,即最小化遍历树所需的时间,即最小化makespan。不出所料,这个问题比顺序问题要困难得多。我们研究了这一问题的计算复杂度,并给出了单位权重树的不逼近性结果。我们设计了一系列实用的启发式方法,在最小化峰值内存使用和最大跨度之间实现不同的权衡。其中一些启发式方法能够在处理树的同时将内存使用保持在给定的内存限制之下。不同的启发式评估在广泛的实验评估使用现实树。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Parallel Scheduling of Task Trees with Limited Memory
This article investigates the execution of tree-shaped task graphs using multiple processors. Each edge of such a tree represents some large data. A task can only be executed if all input and output data fit into memory, and a data can only be removed from memory after the completion of the task that uses it as an input data. Such trees arise in the multifrontal method of sparse matrix factorization. The peak memory needed for the processing of the entire tree depends on the execution order of the tasks. With one processor, the objective of the tree traversal is to minimize the required memory. This problem was well studied, and optimal polynomial algorithms were proposed. Here, we extend the problem by considering multiple processors, which is of obvious interest in the application area of matrix factorization. With multiple processors comes the additional objective to minimize the time needed to traverse the tree—that is, to minimize the makespan. Not surprisingly, this problem proves to be much harder than the sequential one. We study the computational complexity of this problem and provide inapproximability results even for unit weight trees. We design a series of practical heuristics achieving different trade-offs between the minimization of peak memory usage and makespan. Some of these heuristics are able to process a tree while keeping the memory usage under a given memory limit. The different heuristics are evaluated in an extensive experimental evaluation using realistic trees.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
ACM Transactions on Parallel Computing
ACM Transactions on Parallel Computing COMPUTER SCIENCE, THEORY & METHODS-
CiteScore
4.10
自引率
0.00%
发文量
16
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信