Supernode transformation on computer clusters
Yong Chen, Weijia Shang, Yi-chun Fang
2017 10th International Conference on Ubi-media Computing and Workshops (Ubi-Media), August 2017
DOI: 10.1109/UMEDIA.2017.8074109
Citations: 0
Abstract
Supernode transformation, or tiling, is a technique that partitions an algorithm's iteration space to improve data locality and parallelism and thereby minimize running time, benefiting computation- and data-intensive applications such as ubiquitous multimedia services. It groups multiple iterations of nested loops into supernodes, which are then assigned to processors for parallel computation. This paper focuses on supernode transformation on computer clusters, including supernode scheduling, the mapping of supernodes to cluster nodes, and the choice of the optimal supernode size. The algorithms considered are doubly nested loops with regular data dependencies; the Longest Common Subsequence problem is used as an illustration. A novel mathematical model of the total execution time is established as a function of the supernode size, the algorithm parameters, the computation time of each loop iteration, the cluster parameters, and the communication cost. The optimal supernode size derived from this model yields better running times than previous work and is validated by simulations.
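To make the idea concrete, the sketch below tiles the Longest Common Subsequence dynamic-programming table into square supernodes and processes them in wavefront (anti-diagonal) order, the dependence pattern that makes tiles on the same anti-diagonal independent and therefore assignable to different cluster nodes. This is a minimal single-process illustration of the tiling structure, not the paper's model or its optimal-size formula; the tile size parameter is an arbitrary placeholder.

```python
def lcs_length_tiled(x, y, tile=4):
    """Length of the Longest Common Subsequence of x and y, computed
    over square tiles (supernodes) of the DP table.

    Tiles on the same anti-diagonal d = bi + bj have no dependencies
    on each other, so a cluster could compute them in parallel; here
    they are visited sequentially to illustrate the ordering only.
    """
    m, n = len(x), len(y)
    # DP table with an extra zero row and column as boundary values.
    L = [[0] * (n + 1) for _ in range(m + 1)]
    # Number of tiles along each dimension (ceiling division).
    ti = (m + tile - 1) // tile
    tj = (n + tile - 1) // tile
    # Wavefront order over tiles: each tile (bi, bj) depends only on
    # tiles with a smaller anti-diagonal index bi + bj.
    for d in range(ti + tj - 1):
        for bi in range(max(0, d - tj + 1), min(ti, d + 1)):
            bj = d - bi
            # Standard LCS recurrence restricted to this tile.
            for i in range(bi * tile + 1, min((bi + 1) * tile, m) + 1):
                for j in range(bj * tile + 1, min((bj + 1) * tile, n) + 1):
                    if x[i - 1] == y[j - 1]:
                        L[i][j] = L[i - 1][j - 1] + 1
                    else:
                        L[i][j] = max(L[i - 1][j], L[i][j - 1])
    return L[m][n]
```

Because the recurrence inside each tile is unchanged, the result is identical to the untiled computation for any tile size; only the traversal order, and hence the locality and parallelism, differs.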