An Efficient GPU Implementation of Ant Colony Optimization for the Traveling Salesman Problem

2012 Third International Conference on Networking and Computing. Pub Date: 2012-12-05. DOI: 10.1109/ICNC.2012.22

Akihiro Uchida, Yasuaki Ito, K. Nakano


Citations: 57

Abstract

Graphics Processing Units (GPUs) are specialized microprocessors that accelerate graphics operations. Recent GPUs, which have many processing units connected to an off-chip global memory, can be used for general-purpose parallel computation. Ant Colony Optimization (ACO) approaches have been introduced as nature-inspired heuristics for finding good solutions to the Traveling Salesman Problem (TSP). In ACO approaches, a number of ants traverse the cities of the TSP in search of better tours. Each ant randomly selects the next city to visit with a probability determined by the total amount of pheromone deposited on the routes. The main contribution of this paper is a sophisticated and efficient implementation of one of the ACO approaches on the GPU. Our implementation takes into account many programming issues of the GPU architecture, including coalesced access to global memory and shared memory bank conflicts, among others. In particular, we present a very efficient method for the random selection of next cities by a number of ants. Our new method uses an iterative random trial that finds the next city at low computational cost with high probability. The experimental results on an NVIDIA GeForce GTX 580 show that our implementation for 1002 cities runs in 8.71 seconds, while a conventional CPU implementation runs in 381.95 seconds. Thus, our GPU implementation attains a speed-up factor of 43.47.
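The random selection of next cities can be illustrated with a small CPU-side sketch. The abstract does not spell out the details of the iterative random trial, so the code below is only one plausible reading: draw a city from a prefix-sum table built over all cities, retry if the drawn city has already been visited, and fall back to an exact roulette wheel over the unvisited cities after a fixed number of failed trials. The names `roulette_select`, `trial_select`, `build_tour`, and `max_trials`, as well as the weight matrix itself, are hypothetical and are not taken from the paper.

```python
import bisect
import itertools
import random


def roulette_select(weights, visited, rng):
    """Exact roulette-wheel selection restricted to unvisited cities."""
    candidates = [c for c in range(len(weights)) if not visited[c]]
    total = sum(weights[c] for c in candidates)
    r = rng.random() * total
    acc = 0.0
    for c in candidates:
        acc += weights[c]
        if r < acc:
            return c
    return candidates[-1]  # guard against floating-point round-off


def trial_select(prefix, weights, visited, rng, max_trials=8):
    """Assumed form of the random trial: draw over ALL cities, retry on a hit
    to a visited city, and fall back to the exact method if every trial fails."""
    total = prefix[-1]
    for _ in range(max_trials):
        r = rng.random() * total
        # First index whose prefix sum exceeds r; clamp for round-off safety.
        c = min(bisect.bisect_right(prefix, r), len(weights) - 1)
        if not visited[c]:
            return c
    return roulette_select(weights, visited, rng)


def build_tour(weight_matrix, start, rng):
    """Construct one ant's tour; weight_matrix[i][j] is a non-negative
    desirability of moving from city i to city j (hypothetical input)."""
    n = len(weight_matrix)
    # Prefix sums depend only on the weights, so they are built once per row.
    prefixes = [list(itertools.accumulate(row)) for row in weight_matrix]
    visited = [False] * n
    visited[start] = True
    tour = [start]
    for _ in range(n - 1):
        cur = tour[-1]
        nxt = trial_select(prefixes[cur], weight_matrix[cur], visited, rng)
        visited[nxt] = True
        tour.append(nxt)
    return tour
```

Calling `build_tour(weights, 0, random.Random(0))` on a square matrix of positive off-diagonal weights returns a permutation of the city indices starting at city 0. In this sketch the appeal of the trial step is that the prefix sums need not be rebuilt as cities are visited; the trade-off is that trials fail more often late in a tour, which is why the exact fallback is kept.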

