{"title":"Scheduling associative reductions with homogeneous costs when overlapping communications and computations","authors":"Louis-Claude Canon","doi":"10.1109/HiPC.2013.6799124","DOIUrl":null,"url":null,"abstract":"Reduction is a core operation in parallel computing that combines distributed elements into a single result. Optimizing its cost may greatly reduce the application execution time, notably in MPI and MapReduce computations. In this paper, we propose an algorithm for scheduling associative reductions. We focus on the case where communications and computations can be overlapped to fully exploit resources. Our algorithm greedily builds a spanning tree by starting from the root and by adding a child at each iteration. Bounds on the completion time of optimal schedules are then characterized. To show the algorithm extensibility, we adapt it to model variations in which either communication or computation resources are limited. Moreover, we study two specific spanning trees: while the binomial tree is optimal when there is either no transfer or no computation, the k-ary Fibonacci tree is optimal when the transfer cost is equal to the computation cost. Finally, approximation ratios of strategies based on those trees are derived.","PeriodicalId":206307,"journal":{"name":"20th Annual International Conference on High Performance Computing","volume":"54 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"20th Annual International Conference on High Performance Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HiPC.2013.6799124","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3
Abstract
Reduction is a core operation in parallel computing that combines distributed elements into a single result. Optimizing its cost may greatly reduce the application execution time, notably in MPI and MapReduce computations. In this paper, we propose an algorithm for scheduling associative reductions. We focus on the case where communications and computations can be overlapped to fully exploit resources. Our algorithm greedily builds a spanning tree by starting from the root and by adding a child at each iteration. Bounds on the completion time of optimal schedules are then characterized. To show the algorithm extensibility, we adapt it to model variations in which either communication or computation resources are limited. Moreover, we study two specific spanning trees: while the binomial tree is optimal when there is either no transfer or no computation, the k-ary Fibonacci tree is optimal when the transfer cost is equal to the computation cost. Finally, approximation ratios of strategies based on those trees are derived.