基于非线性谱隙的数据相关哈希

Alexandr Andoni, A. Naor, Aleksandar Nikolov, Ilya P. Razenshteyn, Erik Waingarten
{"title":"基于非线性谱隙的数据相关哈希","authors":"Alexandr Andoni, A. Naor, Aleksandar Nikolov, Ilya P. Razenshteyn, Erik Waingarten","doi":"10.1145/3188745.3188846","DOIUrl":null,"url":null,"abstract":"We establish a generic reduction from _nonlinear spectral gaps_ of metric spaces to data-dependent Locality-Sensitive Hashing, yielding a new approach to the high-dimensional Approximate Near Neighbor Search problem (ANN) under various distance functions. Using this reduction, we obtain the following results: * For _general_ d-dimensional normed spaces and n-point datasets, we obtain a _cell-probe_ ANN data structure with approximation O(logd/ε2), space dO(1) n1+ε, and dO(1)nε cell probes per query, for any ε>0. No non-trivial approximation was known before in this generality other than the O(√d) bound which follows from embedding a general norm into ℓ2. * For ℓp and Schatten-p norms, we improve the data structure further, to obtain approximation O(p) and sublinear query _time_. For ℓp, this improves upon the previous best approximation 2O(p) (which required polynomial as opposed to near-linear in n space). For the Schatten-p norm, no non-trivial ANN data structure was known before this work. Previous approaches to the ANN problem either exploit the low dimensionality of a metric, requiring space exponential in the dimension, or circumvent the curse of dimensionality by embedding a metric into a ”tractable” space, such as ℓ1. Our new generic reduction proceeds differently from both of these approaches using a novel partitioning method.","PeriodicalId":20593,"journal":{"name":"Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing","volume":"20 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2018-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"28","resultStr":"{\"title\":\"Data-dependent hashing via nonlinear spectral gaps\",\"authors\":\"Alexandr Andoni, A. Naor, Aleksandar Nikolov, Ilya P. Razenshteyn, Erik Waingarten\",\"doi\":\"10.1145/3188745.3188846\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We establish a generic reduction from _nonlinear spectral gaps_ of metric spaces to data-dependent Locality-Sensitive Hashing, yielding a new approach to the high-dimensional Approximate Near Neighbor Search problem (ANN) under various distance functions. Using this reduction, we obtain the following results: * For _general_ d-dimensional normed spaces and n-point datasets, we obtain a _cell-probe_ ANN data structure with approximation O(logd/ε2), space dO(1) n1+ε, and dO(1)nε cell probes per query, for any ε>0. No non-trivial approximation was known before in this generality other than the O(√d) bound which follows from embedding a general norm into ℓ2. * For ℓp and Schatten-p norms, we improve the data structure further, to obtain approximation O(p) and sublinear query _time_. For ℓp, this improves upon the previous best approximation 2O(p) (which required polynomial as opposed to near-linear in n space). For the Schatten-p norm, no non-trivial ANN data structure was known before this work. Previous approaches to the ANN problem either exploit the low dimensionality of a metric, requiring space exponential in the dimension, or circumvent the curse of dimensionality by embedding a metric into a ”tractable” space, such as ℓ1. Our new generic reduction proceeds differently from both of these approaches using a novel partitioning method.\",\"PeriodicalId\":20593,\"journal\":{\"name\":\"Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing\",\"volume\":\"20 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-06-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"28\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3188745.3188846\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3188745.3188846","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 28

摘要

我们建立了从度量空间的非线性谱隙到数据相关的位置敏感哈希的一般约简,给出了一种解决各种距离函数下的高维近似近邻搜索问题的新方法。*对于一般d维归范空间和n点数据集,我们得到了近似为O(logd/ε2)的_cell-probe_ ANN数据结构,对于任意ε>,每次查询的空间dO(1) n1+ε和dO(1)nε细胞探针。在此一般性中,除了将一般范数嵌入到l2中所得到的O(√d)界之外,没有已知的非平凡近似。*对于p和schattenp范数,我们进一步改进了数据结构,得到近似O(p)和次线性查询_time_。对于p,这改进了之前的最佳近似2O(p)(它需要多项式,而不是n空间中的近线性)。对于schattenp范数,在此工作之前没有已知的非平凡ANN数据结构。以前的人工神经网络问题的方法要么利用度量的低维,在维度上需要空间指数,要么通过将度量嵌入到“可处理”的空间(如1)来规避维度的诅咒。我们的新通用约简使用一种新的划分方法,与这两种方法不同。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Data-dependent hashing via nonlinear spectral gaps
We establish a generic reduction from _nonlinear spectral gaps_ of metric spaces to data-dependent Locality-Sensitive Hashing, yielding a new approach to the high-dimensional Approximate Near Neighbor Search problem (ANN) under various distance functions. Using this reduction, we obtain the following results: * For _general_ d-dimensional normed spaces and n-point datasets, we obtain a _cell-probe_ ANN data structure with approximation O(logd/ε2), space dO(1) n1+ε, and dO(1)nε cell probes per query, for any ε>0. No non-trivial approximation was known before in this generality other than the O(√d) bound which follows from embedding a general norm into ℓ2. * For ℓp and Schatten-p norms, we improve the data structure further, to obtain approximation O(p) and sublinear query _time_. For ℓp, this improves upon the previous best approximation 2O(p) (which required polynomial as opposed to near-linear in n space). For the Schatten-p norm, no non-trivial ANN data structure was known before this work. Previous approaches to the ANN problem either exploit the low dimensionality of a metric, requiring space exponential in the dimension, or circumvent the curse of dimensionality by embedding a metric into a ”tractable” space, such as ℓ1. Our new generic reduction proceeds differently from both of these approaches using a novel partitioning method.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信