234 scheduling of 3-2 and 2-1 eliminations for parallel image compositing using non-power-of-two number of processes
J. Nonaka, K. Ono, M. Fujita
2015 International Conference on High Performance Computing & Simulation (HPCS), 2015-07-20
DOI: 10.1109/HPCSim.2015.7237071 (https://doi.org/10.1109/HPCSim.2015.7237071)
Binary-Swap is a parallel image compositing algorithm based on recursive vector halving and distance doubling, and it works efficiently when the number of processes is exactly a power of two (2^n). Several approaches for converting a non-power-of-two process count to a power of two for Binary-Swap have been proposed. Among them, the Telescope method, based on the Binary Blocks algorithm, has been shown to be the most promising. The Telescope method decomposes the entire set of processes into blocks of power-of-two size and merges the smaller blocks into larger blocks in a stepwise fashion. This block merging constitutes the communication and computational overhead of the conversion, and since only one block can be merged per stage, it becomes inefficient as the number of binary blocks increases. In this paper, we focus on a single-stage conversion method using the 3-2 and 2-1 elimination approaches. The original scheduling method, proposed by Rabenseifner et al., is limited to an odd number of processes because it always schedules a single 3-2 elimination per conversion. Considering that the 3-2 elimination can be optimized on modern HPC systems, which can overlap communication and computation, we propose 234 Scheduling, which schedules multiple 3-2 eliminations per conversion. Scheduling multiple 3-2 eliminations widens the range of application by enabling use with an even number of processes. We evaluated 234 Scheduling applied to Binary-Swap on the K computer, a modern parallel HPC system, and confirmed its effectiveness.
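To illustrate why block merging becomes costly as the number of binary blocks grows, the following is a minimal sketch, not the authors' implementation. It assumes that the power-of-two blocks correspond to the set bits in the binary representation of the process count, and that exactly one block is merged per stage, as the abstract describes. The function names `binary_blocks` and `merge_stages` are hypothetical and chosen only for this illustration.

```python
def binary_blocks(num_processes: int) -> list[int]:
    """Split a process count into power-of-two block sizes, largest first.

    Assumption: each set bit of num_processes yields one block.
    """
    if num_processes < 1:
        raise ValueError("num_processes must be positive")
    blocks = []
    bit = 1 << (num_processes.bit_length() - 1)  # highest set bit
    while bit:
        if num_processes & bit:
            blocks.append(bit)
        bit >>= 1
    return blocks


def merge_stages(num_processes: int) -> int:
    """Count merge stages when only one block is merged per stage (assumed)."""
    return len(binary_blocks(num_processes)) - 1


if __name__ == "__main__":
    for n in (13, 16, 255):
        print(n, binary_blocks(n), merge_stages(n))
```

Under these assumptions, 13 processes split into blocks of 8, 4, and 1 and need two merge stages, a power-of-two count such as 16 needs none, and 255 processes split into eight blocks and need seven stages. This growth is the overhead that a single-stage conversion aims to avoid.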