On approximate nearest neighbors in non-Euclidean spaces

P. Indyk
Published in: Proceedings 39th Annual Symposium on Foundations of Computer Science (Cat. No.98CB36280)
Publication date: 1998-11-08
DOI: 10.1109/SFCS.1998.743438 (https://doi.org/10.1109/SFCS.1998.743438)
Citations: 53

Abstract

The nearest neighbor search (NNS) problem is the following: given a set of $n$ points $P=\{p_1,\ldots,p_n\}$ in some metric space $X$, preprocess $P$ so as to efficiently answer queries that require finding a point in $P$ closest to a query point $q \in X$. The approximate nearest neighbor search ($c$-NNS) is a relaxation of NNS that allows returning any point within $c$ times the distance to the nearest neighbor (called a $c$-nearest neighbor). This problem is of major and growing importance to a variety of applications. In this paper we give an algorithm for $(4\log_{1+\rho}\log 4d+3)$-NNS in $\ell_\infty^d$ with $O(dn^{1+\rho}\log n)$ storage and $O(d\log n)$ query time. In particular, this yields the first algorithm for $O(1)$-NNS for $\ell_\infty$ with subexponential storage. The preprocessing time is linear in the size of the data structure. The algorithm can also be used (after simple modifications) to output the exact nearest neighbor in time bounded by $O(d\log n)$ plus the number of $(4\log_{1+\rho}\log 4d+3)$-nearest neighbors of the query point. Building on this result, we also obtain an approximation algorithm for a general class of product metrics. Finally, we show that for any $c<3$ the $c$-NNS problem in $\ell_\infty$ is provably hard for a version of the indexing model introduced by Hellerstein et al. (1997).
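
To make the definition of a c-nearest neighbor concrete, the following is a minimal sketch of a brute-force baseline under the l-infinity norm. It is not the data structure described in the abstract: it scans all points in O(dn) time per query and only illustrates what a valid c-NNS answer is. The function names (linf_distance, exact_nearest, is_c_nearest) and the sample points are hypothetical and chosen for illustration.

```python
from typing import Sequence

Point = Sequence[float]


def linf_distance(a: Point, b: Point) -> float:
    """l-infinity distance: the largest coordinate-wise absolute difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def exact_nearest(points: Sequence[Point], q: Point) -> Point:
    """Exact nearest neighbor by linear scan (assumed baseline, O(dn) per query)."""
    return min(points, key=lambda p: linf_distance(p, q))


def is_c_nearest(candidate: Point, points: Sequence[Point], q: Point, c: float) -> bool:
    """True if candidate lies within c times the distance from q to its nearest neighbor."""
    best = linf_distance(exact_nearest(points, q), q)
    return linf_distance(candidate, q) <= c * best


# Hypothetical sample data: three points in the plane and one query point.
P = [(0.0, 0.0), (4.0, 1.0), (10.0, 10.0)]
q = (1.0, 1.0)
```

With these sample points, the nearest neighbor of q is (0.0, 0.0) at distance 1, so (4.0, 1.0), at distance 3, is an acceptable answer for c = 3 but not for c = 2.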