High performance CUDA-based implementation for the 2D version of the Maximum Subarray Problem (MSP)
Salah Saleh, Marwan Abdellah, Ahmed A. Abdel Raouf, Y. Kadah
2012 Cairo International Biomedical Engineering Conference (CIBEC), December 2012
DOI: 10.1109/CIBEC.2012.6473291 (https://doi.org/10.1109/CIBEC.2012.6473291)
Citations: 4
Abstract
The Maximum Subarray Problem (MSP) seeks the contiguous segment of an array whose sum is maximal among all possible segments. The problem has applications in fields such as genomic sequence analysis, data mining, and computer vision. Several optimal linear-time solutions exist for the 1D version; however, the known upper bounds for the 2D version are cubic or near-cubic in time, which makes it a problem of high complexity. This work presents a stage-by-stage, high-performance Graphics Processing Unit (GPU)-based implementation that solves the 2D version of the problem in linear time, relying on the Compute Unified Device Architecture (CUDA) technology. It achieves a speedup of more than 7X over a single-threaded sequential implementation on the Central Processing Unit (CPU) for an array of size 512 × 512.
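
For orientation, the sequential baselines mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the textbook linear-time scan (Kadane's algorithm) for the 1D case and the standard cubic reduction of the 2D case to repeated 1D scans over column sums. The function names `kadane` and `max_subarray_2d` are hypothetical, and the abstract does not describe how the GPU stages are organized, so no CUDA mapping is attempted here.

```python
def kadane(values):
    """Maximum sum over all non-empty contiguous segments of a 1D list.

    Linear-time scan: at each element, either extend the running
    segment or start a new one at the current element.
    """
    best = current = values[0]
    for v in values[1:]:
        current = max(v, current + v)
        best = max(best, current)
    return best


def max_subarray_2d(matrix):
    """Maximum sum over all non-empty rectangular sub-blocks of a 2D list.

    Textbook cubic approach (an assumption, not the paper's method):
    fix a top row, extend the bottom row one step at a time while
    accumulating per-column sums, and run the 1D scan on those sums.
    """
    rows, cols = len(matrix), len(matrix[0])
    best = matrix[0][0]
    for top in range(rows):
        col_sums = [0] * cols  # column sums for rows top..bottom
        for bottom in range(top, rows):
            for c in range(cols):
                col_sums[c] += matrix[bottom][c]
            best = max(best, kadane(col_sums))
    return best
```

For an n × n input, the two row loops contribute O(n²) row pairs and each pair costs O(n), which gives the cubic bound the abstract refers to. The 1D scans for different row pairs do not depend on one another, which is one plausible reason the problem suits a GPU, although the abstract does not say whether the authors exploit it this way.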