A Directed Isoperimetric Inequality with application to Bregman Near Neighbor Lower Bounds

A. Abdullah, Suresh Venkatasubramanian
Published in: Proceedings of the forty-seventh annual ACM symposium on Theory of Computing
DOI: 10.1145/2746539.2746595 (https://doi.org/10.1145/2746539.2746595)
Publication date: 2014-04-04
Citations: 16

Abstract

Bregman divergences are important distance measures used in applications such as computer vision, text mining, and speech processing, and are a focus of interest in machine learning because of their information-theoretic properties. Algorithms for clustering and near neighbor search with respect to these divergences have been studied extensively. In all cases, the guarantees depend not just on the data size n and dimensionality d, but also on a structure constant μ ≥ 1 that depends solely on a generating convex function φ and can grow without bound independently of n and d. In general, this μ parametrizes the degree to which a given divergence is "asymmetric". In this paper, we provide the first evidence that this dependence on μ might be intrinsic. We focus on the problem of approximate near neighbor (ANN) search for Bregman divergences. We show that under the cell probe model, any non-adaptive data structure (like locality-sensitive hashing) for c-approximate near-neighbor search that admits r probes must use space Ω(dn^{1+μ/(cr)}). In contrast, for LSH under ℓ₁ the best bound is Ω(dn^{1+1/(cr)}). Our results interpolate between known lower bounds for LSH-based ANN under ℓ₁ and for the generally harder Partial Match problem (in non-adaptive settings). The bounds match the former when μ is small and the latter when μ is Ω(d/log n). This further strengthens the intuition that Partial Match corresponds to an "asymmetric" version of ANN, and opens up the possibility of a new line of attack for lower bounds on Partial Match. Our new tool is a directed variant of the standard boolean noise operator. We prove a generalization of the Bonami-Beckner hypercontractivity inequality (restricted to certain subsets of the Hamming cube), and use this to prove the desired directed isoperimetric inequality that we use in our data structure lower bound.
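To make the notion of an "asymmetric" divergence concrete, here is a minimal Python sketch. It uses the standard definition D_φ(x, y) = φ(x) − φ(y) − ⟨∇φ(y), x − y⟩ for a convex generating function φ. The abstract does not state how μ is defined, so the ratio computed below is only an illustrative assumption about how asymmetry could be measured, not the paper's definition. All function names and the sample points are hypothetical.

```python
import math

def bregman(phi, grad_phi, x, y):
    """Bregman divergence D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>."""
    inner = sum(g * (xi - yi) for g, xi, yi in zip(grad_phi(y), x, y))
    return phi(x) - phi(y) - inner

# Negative entropy generates the KL divergence on probability vectors.
def neg_entropy(p):
    return sum(pi * math.log(pi) for pi in p)

def grad_neg_entropy(p):
    return [math.log(pi) + 1.0 for pi in p]

# Squared Euclidean norm generates the (symmetric) squared distance.
def sq_norm(x):
    return sum(xi * xi for xi in x)

def grad_sq_norm(x):
    return [2.0 * xi for xi in x]

# Hypothetical sample points: one balanced and one skewed distribution.
p = [0.5, 0.5]
q = [0.99, 0.01]

d_pq = bregman(neg_entropy, grad_neg_entropy, p, q)
d_qp = bregman(neg_entropy, grad_neg_entropy, q, p)

# Illustrative asymmetry measure (an assumption, not the paper's mu):
# the ratio between the two orderings of the arguments.
asymmetry_ratio = max(d_pq, d_qp) / min(d_pq, d_qp)

print(f"D(p, q) = {d_pq:.4f}, D(q, p) = {d_qp:.4f}, ratio = {asymmetry_ratio:.2f}")
```

For the squared Euclidean norm the two orderings always agree, so the same ratio is 1; for the KL divergence it grows as q moves toward the boundary of the simplex, which matches the abstract's remark that the asymmetry can grow without bound.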