A SIMD Solution for the Quadratic Assignment Problem with GPU Acceleration

Proceedings of XSEDE16 : Diversity, Big Data, and Science at Scale : July 17-21, 2016, Intercontinental Miami Hotel, Miami, Florida, USA. Conference on Extreme Science and Engineering Discovery Environment (5th : 2016 : Miami, Fla.)

Pub Date: 2014-07-13

DOI: 10.1145/2616498.2616521

Abhilash Chaparala, C. Novoa, Apan Qasem

Pages: 1:1-1:8

Citations: 8

Abstract

In the Quadratic Assignment Problem (QAP), n units (usually departments, machines, or electronic components) must be assigned to n locations, given the distances between the locations and the flows between the units. The goal is to find the assignment that minimizes the sum, over all pairs of units, of the product of the flow between the two units and the distance between their assigned locations. The QAP is a combinatorial problem that is difficult to solve to optimality even when n is relatively small (e.g., n = 30). In this paper, we solve the QAP using a parallel algorithm that employs a 2-opt heuristic and leverages the compute capabilities of current GPUs. The algorithm is implemented on the Stampede cluster hosted by the Texas Advanced Computing Center (TACC) at the University of Texas at Austin and on a GPU-equipped server housed at Texas State University. We enhance our implementation by fine-tuning the occupancy levels and by exploiting inter-thread data locality through improved shared memory allocation. In a series of experiments on the well-known QAPLIB data sets, our algorithm outperforms, on average, an OpenMP implementation by a factor of 16.31 and a Tabu-search-based GPU implementation by a factor of 58.61. Given the wide applicability of the QAP, the proposed algorithm has good potential to accelerate discovery in engineering research, particularly in operations research and the design of electronic devices.
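The objective and the 2-opt heuristic named in the abstract can be illustrated with a minimal sequential sketch in Python. This is not the authors' GPU implementation: the function names `qap_cost` and `two_opt`, the best-improvement swap rule, and the full recomputation of the cost after every trial swap are all assumptions made for illustration. The abstract only states that a 2-opt heuristic is used.

```python
from itertools import combinations


def qap_cost(flow, dist, perm):
    """Cost of assigning unit i to location perm[i].

    Sums flow[i][j] * dist[perm[i]][perm[j]] over all ordered pairs (i, j).
    """
    n = len(perm)
    return sum(
        flow[i][j] * dist[perm[i]][perm[j]]
        for i in range(n)
        for j in range(n)
    )


def two_opt(flow, dist, perm):
    """Pairwise-exchange (2-opt) local search, best-improvement variant.

    Hypothetical sketch: each pass tries every swap of two units' locations,
    applies the single best improving swap, and stops at a local optimum.
    """
    perm = list(perm)
    best = qap_cost(flow, dist, perm)
    improved = True
    while improved:
        improved = False
        best_swap = None
        for i, j in combinations(range(len(perm)), 2):
            # Trial swap: evaluate the neighbour, then undo the swap.
            perm[i], perm[j] = perm[j], perm[i]
            cost = qap_cost(flow, dist, perm)
            perm[i], perm[j] = perm[j], perm[i]
            if cost < best:
                best, best_swap = cost, (i, j)
        if best_swap is not None:
            i, j = best_swap
            perm[i], perm[j] = perm[j], perm[i]
            improved = True
    return perm, best
```

Calling `two_opt(flow, dist, start)` with two n-by-n matrices and a starting permutation returns a locally optimal assignment and its cost. Within one pass, each trial swap is evaluated independently of the others, which is presumably the kind of data parallelism a GPU version would exploit. How the paper maps this work to threads and shared memory is not described in the abstract.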
