Performance Evaluation of a Parallel Dynamic Programming Algorithm for Solving the 1D Array Partitioning Problem

2017 International Conference on High Performance Computing & Simulation (HPCS). Pub Date: 2017-07-01. DOI: 10.1109/HPCS.2017.59

H. Salhi, Bchira Ben Mabrouk, Z. Mahjoub



Abstract



We address the 1D array partitioning problem (1D-APP), an easy combinatorial optimization problem for which an exact dynamic programming algorithm (DPA) is known in the literature. The DPA is structured as a perfect nest of three DO loops (3DLN) with affine loop bounds. Because its cubic complexity may be too time-consuming for large real-world problems, we propose a parallelization approach (PA). The approach starts with a dependence analysis within the nest (presented in a previous work), from which several versions of the original DPA are derived and the (theoretically) best one is kept. Starting from this version, itself a 3DLN, our contribution here first consists in choosing two task segmentations corresponding to two grain sizes: fine grain, where a grain is the body of the third loop of the 3DLN, and medium grain, where a grain is the body of the second loop. We then construct particular level decompositions (LDs) of the corresponding layered task graphs and, assuming an arbitrary number of available processors, design several schedulings (four in the fine-grain case and two in the medium-grain case) based on scanning the levels of the LDs with and without inter-level overlapping. For each case, the makespans of the schedulings are explicitly determined and analysed. Our theoretical contribution is validated through a series of simulations carried out on different input data and for different numbers of available processors. This allows a fine-grained comparison of the different schedulings and shows their respective efficiencies.
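The abstract does not state the recurrence, so the following is only a minimal sketch of what a three-loop DPA for the 1D-APP could look like. It assumes the common min-max (bottleneck) formulation: split an array into `p` contiguous blocks so that the largest block sum is as small as possible. The names `partition_1d` and `level_makespan` are hypothetical, and the makespan helper further assumes unit-time tasks and a strictly level-by-level scan without inter-level overlapping; neither is taken from the paper.

```python
from itertools import accumulate
from math import ceil, inf


def partition_1d(a, p):
    """Smallest possible largest-block sum when `a` is cut into `p`
    contiguous blocks (assumed min-max formulation of the 1D-APP)."""
    n = len(a)
    if not 1 <= p <= n:
        raise ValueError("need 1 <= p <= len(a)")
    prefix = [0, *accumulate(a)]  # prefix[i] = a[0] + ... + a[i-1]
    # best[k][i]: optimal bottleneck for the first i elements using k blocks
    best = [[inf] * (n + 1) for _ in range(p + 1)]
    best[0][0] = 0
    for k in range(1, p + 1):          # loop 1: number of blocks used
        for i in range(k, n + 1):      # loop 2: length of the prefix covered
            for j in range(k - 1, i):  # loop 3: where the last block starts
                cost = max(best[k - 1][j], prefix[i] - prefix[j])
                if cost < best[k][i]:
                    best[k][i] = cost
    return best[p][n]


def level_makespan(level_sizes, q):
    """Makespan of a level-by-level schedule on q processors, assuming
    unit-time tasks and no overlap between consecutive levels."""
    return sum(ceil(size / q) for size in level_sizes)


if __name__ == "__main__":
    print(partition_1d([1, 2, 3, 4, 5, 6, 7, 8, 9], 3))  # 17
    print(level_makespan([4, 3, 2, 1], 2))               # 6
```

All three loop bounds are affine in the enclosing indices, which matches the 3DLN shape described above. In this sketch a fine grain would be the body of the `j` loop and a medium grain the body of the `i` loop. Under the same unit-time assumption, `ceil(sum(level_sizes) / q)` is a lower bound on the makespan of any schedule, which is the gap that inter-level overlapping aims to close.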
