Randomized Approximate Nearest Neighbor Search with Limited Adaptivity

Mingmou Liu, Xiaoyin Pan, Yitong Yin
{"title":"Randomized Approximate Nearest Neighbor Search with Limited Adaptivity","authors":"Mingmou Liu, Xiaoyin Pan, Yitong Yin","doi":"10.1145/2935764.2935776","DOIUrl":null,"url":null,"abstract":"We study the problem of approximate nearest neighbor search in $d$-dimensional Hamming space {0,1}d. We study the complexity of the problem in the famous cell-probe model, a classic model for data structures. We consider algorithms in the cell-probe model with limited adaptivity, where the algorithm makes k rounds of parallel accesses to the data structure for a given k. For any k ≥ 1, we give a simple randomized algorithm solving the approximate nearest neighbor search using k rounds of parallel memory accesses, with O(k(log d)1/k) accesses in total. We also give a more sophisticated randomized algorithm using O(k+(1/k log d)O(1/k)) memory accesses in k rounds for large enough k. Both algorithms use data structures of size polynomial in n, the number of points in the database. We prove an Ω(1/k(log d)1/k) lower bound for the total number of memory accesses required by any randomized algorithm solving the approximate nearest neighbor search within k ≤ (log log d)/(2 log log log d) rounds of parallel memory accesses on any data structures of polynomial size. This lower bound shows that our first algorithm is asymptotically optimal for any constant round k. And our second algorithm approaches the asymptotically optimal tradeoff between rounds and memory accesses, in a sense that the lower bound of memory accesses for any k1 rounds can be matched by the algorithm within k2=O(k1) rounds. In the extreme, for some large enough k=Θ((log log d)/(log log log d)), our second algorithm matches the Θ((log log d)/(log log log d)) tight bound for fully adaptive algorithms for approximate nearest neighbor search due to Chakrabarti and Regev.","PeriodicalId":346939,"journal":{"name":"Proceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures","volume":"20 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2935764.2935776","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 8

Abstract

We study the problem of approximate nearest neighbor search in the $d$-dimensional Hamming space $\{0,1\}^d$. We study the complexity of the problem in the famous cell-probe model, a classic model for data structures. We consider algorithms in the cell-probe model with limited adaptivity, where the algorithm makes $k$ rounds of parallel accesses to the data structure for a given $k$. For any $k \ge 1$, we give a simple randomized algorithm solving approximate nearest neighbor search using $k$ rounds of parallel memory accesses, with $O(k(\log d)^{1/k})$ accesses in total. We also give a more sophisticated randomized algorithm using $O(k + (\frac{1}{k}\log d)^{O(1/k)})$ memory accesses in $k$ rounds, for large enough $k$. Both algorithms use data structures of size polynomial in $n$, the number of points in the database. We prove an $\Omega(\frac{1}{k}(\log d)^{1/k})$ lower bound on the total number of memory accesses required by any randomized algorithm solving approximate nearest neighbor search within $k \le \frac{\log\log d}{2\log\log\log d}$ rounds of parallel memory accesses, on any data structure of polynomial size. This lower bound shows that our first algorithm is asymptotically optimal for any constant number of rounds $k$. Our second algorithm approaches the asymptotically optimal tradeoff between rounds and memory accesses, in the sense that the lower bound on memory accesses for any $k_1$ rounds can be matched by the algorithm within $k_2 = O(k_1)$ rounds. In the extreme, for some large enough $k = \Theta(\frac{\log\log d}{\log\log\log d})$, our second algorithm matches the $\Theta(\frac{\log\log d}{\log\log\log d})$ tight bound for fully adaptive algorithms for approximate nearest neighbor search due to Chakrabarti and Regev.
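
To make the round–probe tradeoff concrete, here is a sketch of the arithmetic implied by the stated bounds (not a claim beyond them): the upper bound $O(k(\log d)^{1/k})$ specializes for small fixed $k$ as

$$k=1:\ O(\log d), \qquad k=2:\ O(\sqrt{\log d}), \qquad k=3:\ O((\log d)^{1/3}).$$

For any constant $k$, this upper bound is within a factor of $k^2 = O(1)$ of the $\Omega(\frac{1}{k}(\log d)^{1/k})$ lower bound, which is why the first algorithm is asymptotically optimal when the number of rounds is constant.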