Parallel Tree Traversal for Nearest Neighbor Query on the GPU
Moohyeon Nam, Jinwoong Kim, Beomseok Nam
2016 45th International Conference on Parallel Processing (ICPP), published 2016-08-01
DOI: 10.1109/ICPP.2016.20 (https://doi.org/10.1109/ICPP.2016.20)
Similarity search arises in many application domains, including computer graphics, information retrieval, statistics, computational biology, and scientific data processing. Several recent studies have accelerated k-nearest neighbor (kNN) queries using GPUs, but most of them develop brute-force exhaustive scanning algorithms that leverage a large number of GPU cores, and none employs GPUs for an n-ary tree-structured index. Multi-dimensional hierarchical indexing trees such as R-trees are known to be poorly suited to GPUs because of their irregular tree traversal and memory access patterns. Traversing hierarchical tree structures in an irregular manner makes it difficult to exploit parallelism, since GPUs are tailored for deterministic memory accesses. In this work, we develop a data-parallel tree traversal algorithm, Parallel Scan and Backtrack (PSB), for kNN query processing on the GPU. The algorithm traverses a multi-dimensional tree-structured index while avoiding warp divergence. To take advantage of contiguous memory accesses, PSB linearly scans sibling leaf nodes, which creates more opportunities to optimize the parallel SIMD algorithm. We evaluate PSB against the classic branch-and-bound kNN query processing algorithm. Our experiments with real datasets show that PSB is substantially faster than the branch-and-bound algorithm.
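The idea of scanning contiguously stored leaf nodes while pruning by distance bound can be illustrated with a small sequential sketch. This is not the PSB algorithm itself, whose details the abstract does not give, and it runs on the CPU rather than the GPU. The leaf layout (a flat list of bounding box plus points) and the names `mindist` and `knn_leaf_scan` are assumptions made only for illustration.

```python
import heapq
import math


def mindist(query, lo, hi):
    """Smallest possible distance from `query` to any point in the box [lo, hi]."""
    total = 0.0
    for q, l, h in zip(query, lo, hi):
        gap = max(l - q, 0.0, q - h)  # zero when q lies inside [l, h] on this axis
        total += gap * gap
    return math.sqrt(total)


def knn_leaf_scan(leaves, query, k):
    """Scan leaves in storage order and return (neighbors, leaves_visited).

    `leaves` is a hypothetical flat layout: a list of (lo, hi, points) tuples,
    where lo/hi are the corners of the leaf's bounding box.
    `neighbors` is a list of (distance, point) pairs, closest first.
    """
    best = []  # max-heap emulated with negated distances, at most k entries
    visited = 0
    for lo, hi, points in leaves:
        # Skip a leaf whose box cannot contain anything closer than
        # the current k-th best candidate.
        if len(best) == k and mindist(query, lo, hi) >= -best[0][0]:
            continue
        visited += 1
        for p in points:
            d = math.dist(query, p)
            if len(best) < k:
                heapq.heappush(best, (-d, p))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, p))
    neighbors = sorted((-neg_d, p) for neg_d, p in best)
    return neighbors, visited


# Toy data: three leaves, two points each (made up for this example).
leaves = [
    ((0.0, 0.0), (1.0, 1.0), [(0.2, 0.1), (0.9, 0.8)]),
    ((2.0, 0.0), (3.0, 1.0), [(2.1, 0.5), (2.9, 0.9)]),
    ((8.0, 8.0), (9.0, 9.0), [(8.5, 8.5), (9.0, 8.0)]),
]
neighbors, visited = knn_leaf_scan(leaves, (0.0, 0.0), k=2)
print(neighbors, visited)
```

In this sketch the loop over leaves touches memory in storage order, which is the access pattern the abstract credits for making SIMD optimization easier. A GPU version would presumably assign the points of a leaf to the threads of a warp so that all threads follow the same control path, but that mapping is a guess and is not taken from the paper.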