Authors: L. S. N. Nunes, J. Bordim, K. Nakano, Yasuaki Ito
Published in: 2016 Fourth International Symposium on Computing and Networking (CANDAR), 2016-11-01
DOI: 10.1109/CANDAR.2016.0090 (https://doi.org/10.1109/CANDAR.2016.0090)
A Memory-Access-Efficient Implementation of the Approximate String Matching Algorithm on GPU
The task of finding strings that partially match a given pattern is of interest to a number of practical applications, including DNA sequencing and text searching. Owing to its importance, approaches to accelerating Approximate String Matching (ASM) have been widely investigated in the literature. The main contribution of this work is a memory-access-efficient implementation for computing the ASM on a GPU. The key idea of our implementation is to use warp shuffle operations to reduce the communication overhead between threads. Experimental results, carried out on a GeForce GTX 960 GPU, show that the proposed implementation achieves a speedup of between 1.31 and 1.84 times over another noteworthy alternative.
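
To make the idea of neighbor-to-neighbor communication concrete, the following is a minimal CPU sketch in Python, not the authors' GPU kernel. It assumes the usual unit-cost edit-distance formulation of ASM, which the abstract does not spell out, and the function names `asm_reference` and `asm_wavefront` are hypothetical. In the second function each simulated "lane" owns one pattern row of the dynamic-programming table and reads only its lower-numbered neighbor's latest value. On a GPU that read is the kind of exchange a warp shuffle (for example CUDA's `__shfl_up_sync`) can perform without going through shared or global memory.

```python
def asm_reference(pattern, text):
    """Row-by-row DP. D[i][j] is the minimum number of edits needed to match
    pattern[:i] against some substring of text that ends at position j."""
    n = len(text)
    prev = [0] * (n + 1)              # D[0][j] = 0: a match may start anywhere
    for i in range(1, len(pattern) + 1):
        cur = [i] + [0] * n           # D[i][0] = i: i pattern characters unmatched
        for j in range(1, n + 1):
            cost = 0 if pattern[i - 1] == text[j - 1] else 1
            cur[j] = min(prev[j - 1] + cost,   # match or substitute
                         prev[j] + 1,          # skip a pattern character
                         cur[j - 1] + 1)       # skip a text character
        prev = cur
    return prev[1:]                   # best distance for a match ending at each j


def asm_wavefront(pattern, text):
    """Same table, computed along anti-diagonals. Lane i owns DP row i + 1 and
    only ever reads lane i - 1's latest value (a simulated shuffle-up)."""
    m, n = len(pattern), len(text)
    cur = [i + 1 for i in range(m)]   # each lane's latest value, D[i+1][0]
    diag = list(range(m))             # neighbor's value from two steps ago, D[i][0]
    result = [0] * n
    for t in range(n + m - 1):
        # Simulated shuffle: lane i receives lane i-1's value; lane 0 receives
        # the boundary row D[0][j] = 0.
        up = [0] + cur[:-1]
        nxt = cur[:]
        for i in range(m):
            j = t - i                 # text index handled by lane i at step t
            if 0 <= j < n:
                cost = 0 if pattern[i] == text[j] else 1
                nxt[i] = min(diag[i] + cost, up[i] + 1, cur[i] + 1)
                diag[i] = up[i]       # keep the received value for the next step
                if i == m - 1:
                    result[j] = nxt[i]
        cur = nxt
    return result
```

Both functions return, for every text position, the smallest edit distance of any match ending there, so positions whose value is at most a chosen threshold are the approximate matches. The wavefront version keeps only a few values per lane, which is why the exchange pattern fits register-to-register communication. How the paper maps lanes to threads and handles patterns longer than a warp is not stated in the abstract and is not modeled here.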