Performance Evaluation of a Parallel Dynamic Programming Algorithm for Solving the 1D Array Partitioning Problem
Authors: H. Salhi, Bchira Ben Mabrouk, Z. Mahjoub
Published in: 2017 International Conference on High Performance Computing & Simulation (HPCS), 2017-07-01
DOI: 10.1109/HPCS.2017.59 (https://doi.org/10.1109/HPCS.2017.59)
We address the 1D array partitioning problem (1D-APP), an easy (polynomially solvable) combinatorial optimization problem for which an exact dynamic programming algorithm (DPA) is known in the literature. The DPA is structured as a perfect nest of three DO loops (3DLN) with affine loop bounds. Because its cubic complexity may be too time-consuming for large real-world instances, we propose a parallelization approach (PA). The PA starts with a dependence analysis within the nest (presented in a previous work), which makes it possible to derive several versions of the original DPA and then keep the (theoretically) best one. Taking this best version, itself a 3DLN, our contribution here first consists in choosing two task segmentations corresponding to two grain sizes: fine grain, where a grain is the body of the third loop of the 3DLN, and medium grain, where a grain is the body of the second loop. We then construct particular level decompositions (LDs) of the corresponding layered task graphs and, for an arbitrary number of available processors, design several schedules (four in the fine-grain case and two in the medium-grain case) based on scanning the levels of the LDs with and without inter-level overlapping. For each case, the makespans of the schedules are explicitly determined and analysed. Our theoretical contribution is validated through a series of simulations carried out on different input data and for different numbers of available processors. These simulations allow a fine comparison of the different schedules and show their respective efficiencies.
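The abstract does not spell out the recurrence, so the following is only a minimal sketch under stated assumptions. It assumes the common formulation of 1D array partitioning: split an array of non-negative weights into p contiguous, non-empty segments so that the largest segment sum is as small as possible. The function names `partition_1d` and `level_by_level_makespan` are hypothetical, and the loop order shown is one textbook variant, not necessarily the version the authors retain. The second helper illustrates level-by-level scanning without inter-level overlapping, under the additional simplifying assumption that every task in a level takes one time unit and that tasks within a level are independent. It is not the makespan formula derived in the paper.

```python
from itertools import accumulate
from math import ceil, inf


def partition_1d(weights, p):
    """Smallest possible largest-segment sum when `weights` is split into
    exactly `p` contiguous, non-empty segments (assumed problem formulation)."""
    n = len(weights)
    if not 1 <= p <= n:
        raise ValueError("need 1 <= p <= len(weights)")
    prefix = [0] + list(accumulate(weights))  # prefix[i] == sum(weights[:i])
    # best[k][i]: optimal bottleneck for the first i elements using k segments
    best = [[inf] * (n + 1) for _ in range(p + 1)]
    best[0][0] = 0
    for k in range(1, p + 1):          # loop 1: number of segments
        for i in range(k, n + 1):      # loop 2: length of the prefix
            for j in range(k - 1, i):  # loop 3: where the last segment starts
                cand = max(best[k - 1][j], prefix[i] - prefix[j])
                if cand < best[k][i]:
                    best[k][i] = cand
    return best[p][n]


def level_by_level_makespan(level_sizes, processors):
    """Makespan when levels are processed one after another with no overlap,
    assuming unit-time, mutually independent tasks inside each level."""
    if processors < 1:
        raise ValueError("need at least one processor")
    return sum(ceil(size / processors) for size in level_sizes)


if __name__ == "__main__":
    print(partition_1d([7, 2, 5, 10, 8], 2))
    print(level_by_level_makespan([3, 5, 2], 2))
```

The three nested loops have affine bounds (each bound is a linear function of the enclosing indices), which is the loop-nest shape the abstract describes. In this sketch a fine grain would be one iteration of the innermost `j` loop and a medium grain one iteration of the `i` loop.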