Optimal Pivots to Minimize the Index Size for Metric Access Methods

Luis González Ares, N. Brisaboa, María F. Esteller, Oscar Pedreira, Á. Places
Published in: 2009 Second International Workshop on Similarity Search and Applications. Publication date: 2009-08-29. DOI: 10.1109/SISAP.2009.21 (https://doi.org/10.1109/SISAP.2009.21)
Citations: 16

Abstract

We consider the problem of similarity search in metric spaces with costly distance functions and large databases. There is a trade-off between the amount of information stored in the index and the reduction in the number of distance comparisons needed to solve a query. Pivot-based methods clearly outperform clustering-based ones in the number of comparisons, but their space requirements are higher, which can prevent their application to real problems. Several strategies have therefore been proposed to reduce the space needed by pivot-based methods, such as BAESA, FQA, or KVP. In this paper, we analyze the usefulness of pivots depending on their proximity to the object. As a consequence of this analysis, we propose a new pivot-based method that requires an amount of space equal to or very close to that needed by clustering-based methods. We provide experimental results showing that our proposal is a competitive alternative to clustering-oriented solutions when using the same amount of memory.
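To make the trade-off in the abstract concrete, the following is a minimal sketch of generic pivot-based range search, not the method proposed in the paper. It assumes a table holding one precomputed distance per object-pivot pair, which is the space cost the abstract refers to, and uses the triangle inequality to discard objects without evaluating the costly distance. The function names `build_pivot_table` and `range_search` and the `dist` parameter are hypothetical, chosen only for illustration.

```python
def build_pivot_table(objects, pivots, dist):
    """Precompute d(o, p) for every object o and pivot p.

    The table holds len(objects) * len(pivots) distances; this is the
    index size that space-reducing pivot methods try to shrink.
    """
    return [[dist(o, p) for p in pivots] for o in objects]


def range_search(query, radius, objects, pivots, table, dist):
    """Return (matches, evaluated) for the range query (query, radius).

    `evaluated` counts how many objects needed a real distance
    computation against the query after pivot filtering.
    """
    # One distance computation per pivot, paid once per query.
    q_to_pivots = [dist(query, p) for p in pivots]

    matches = []
    evaluated = 0
    for obj, row in zip(objects, table):
        # Triangle inequality: |d(q, p) - d(o, p)| <= d(q, o) for any pivot p,
        # so the maximum over pivots is a lower bound on d(q, o).
        lower_bound = max(abs(dq - do) for dq, do in zip(q_to_pivots, row))
        if lower_bound > radius:
            continue  # obj cannot be within the radius; skip the costly distance
        evaluated += 1
        if dist(query, obj) <= radius:
            matches.append(obj)
    return matches, evaluated
```

Under these assumptions, adding pivots tightens the lower bound and lowers `evaluated`, but each extra pivot adds one stored distance per object. A clustering-style index would instead keep roughly one distance per object (to its cluster center), which is the memory budget the abstract says the proposed method aims to match.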