An Efficient GPU Implementation of Ant Colony Optimization for the Traveling Salesman Problem

Akihiro Uchida, Yasuaki Ito, K. Nakano
2012 Third International Conference on Networking and Computing, published 2012-12-05
DOI: 10.1109/ICNC.2012.22 (https://doi.org/10.1109/ICNC.2012.22)
Citations: 57

Abstract

Graphics Processing Units (GPUs) are specialized microprocessors that accelerate graphics operations. Recent GPUs, which have many processing units connected to an off-chip global memory, can also be used for general-purpose parallel computation. Ant Colony Optimization (ACO) approaches have been introduced as nature-inspired heuristics for finding good solutions to the Traveling Salesman Problem (TSP). In ACO approaches, a number of ants traverse the cities of the TSP in search of better tours. Each ant randomly selects the next city to visit, with probabilities determined by the total amount of pheromone deposited on the routes. The main contribution of this paper is a sophisticated and efficient implementation of one of the ACO approaches on the GPU. Our implementation addresses many programming issues of the GPU architecture, including coalesced access to global memory and shared memory bank conflicts. In particular, we present a very efficient method for the random selection of next cities by a number of ants. Our new method uses an iterative random trial, which finds the next city at low computational cost with high probability. Experimental results on an NVIDIA GeForce GTX 580 show that our implementation for 1002 cities runs in 8.71 seconds, while a conventional CPU implementation runs in 381.95 seconds. Thus, our GPU implementation attains a speed-up factor of 43.47.
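The abstract does not spell out how the iterative random trial works. The following is a minimal CPU-side sketch in Python, assuming the trial behaves like rejection sampling on a roulette wheel built over all cities: draw a city from a prefix sum that ignores the visited set, accept it if it is unvisited, and otherwise retry. The names `select_next_city`, `build_tour`, and `max_trials`, and the exact fallback after repeated failures, are hypothetical choices made for illustration; they are not taken from the paper, and the sketch does not model GPU memory behavior.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def select_next_city(weights, prefix, visited, rng, max_trials=8):
    """Pick an unvisited city with probability proportional to its weight.

    weights: selection weights from the current city to every city
             (assumed positive for every unvisited city).
    prefix:  running sums of weights, computed once and reused,
             so each trial costs only a binary search.
    """
    total = prefix[-1]
    for _ in range(max_trials):
        r = rng.random() * total
        # First index whose running sum exceeds r; clamp guards rounding.
        city = min(bisect_right(prefix, r), len(prefix) - 1)
        if not visited[city]:
            return city  # trial succeeded
    # Assumed fallback: exact roulette restricted to unvisited cities.
    candidates = [c for c, seen in enumerate(visited) if not seen]
    if not candidates:
        raise ValueError("all cities already visited")
    picked = rng.choices(
        candidates, weights=[weights[c] for c in candidates], k=1
    )
    return picked[0]


def build_tour(weight_matrix, start, rng):
    """Construct one ant's tour by repeated next-city selection."""
    n = len(weight_matrix)
    # Prefix sums depend only on the current city, not on the visited set.
    prefixes = [list(accumulate(row)) for row in weight_matrix]
    visited = [False] * n
    visited[start] = True
    tour = [start]
    while len(tour) < n:
        here = tour[-1]
        nxt = select_next_city(weight_matrix[here], prefixes[here], visited, rng)
        visited[nxt] = True
        tour.append(nxt)
    return tour
```

Under this reading, the appeal of the trial is that the prefix sums need not be rebuilt as each ant's visited set grows. Early in a tour most trials succeed immediately; the fallback only matters late in the tour, when few unvisited cities remain. Calling `build_tour(weight_matrix, 0, random.Random(1))` on a square matrix of positive weights returns a permutation of the city indices that starts at city 0.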