Optimal Pivots to Minimize the Index Size for Metric Access Methods

2009 Second International Workshop on Similarity Search and Applications. Pub Date: 2009-08-29. DOI: 10.1109/SISAP.2009.21

Luis González Ares, N. Brisaboa, María F. Esteller, Oscar Pedreira, Á. Places

Citations: 16

Abstract

We consider the problem of similarity search in metric spaces with costly distance functions and large databases. There is a trade-off between the amount of information stored in the index and the reduction in the number of comparisons needed to solve a query. Pivot-based methods clearly outperform clustering-based ones in the number of comparisons, but their space requirements are higher, which can prevent their application to real problems. Several strategies have therefore been proposed to reduce the space needed by pivot-based methods, such as BAESA, FQA, or KVP. In this paper, we analyze the usefulness of pivots depending on their proximity to the object. As a consequence of this analysis, we propose a new pivot-based method that requires an amount of space equal to, or very close to, that needed by clustering-based methods. We provide experimental results showing that our proposal is competitive with clustering-based solutions when using the same amount of memory.
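
The space versus comparisons trade-off described in the abstract can be illustrated with a minimal sketch of generic pivot-based range search. This is not the method proposed in the paper: the function names, the Euclidean distance, the random data, and the rule of keeping only the k pivots nearest to each object are all assumptions made for illustration. With k = 1 the index stores one pivot identifier and one distance per object, which is roughly the footprint of a clustering-based index.

```python
import random


def euclidean(a, b):
    """Stand-in for a costly metric distance (assumption for this sketch)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def build_index(objects, pivots, dist, k):
    """For each object, keep only its k nearest pivots as (pivot_id, distance)."""
    index = []
    for obj in objects:
        by_distance = sorted((dist(obj, p), i) for i, p in enumerate(pivots))
        index.append([(i, d) for d, i in by_distance[:k]])
    return index


def range_search(query, radius, objects, pivots, index, dist):
    """Return (ids of objects within radius, number of distance evaluations)."""
    q_to_pivot = [dist(query, p) for p in pivots]
    evaluations = len(pivots)
    result = []
    for obj_id, stored in enumerate(index):
        # Triangle inequality: |d(q, p) - d(o, p)| <= d(q, o), so an object
        # can be discarded without computing d(q, o) if any stored pivot
        # gives a lower bound larger than the search radius.
        if any(abs(q_to_pivot[i] - d) > radius for i, d in stored):
            continue
        evaluations += 1
        if dist(query, objects[obj_id]) <= radius:
            result.append(obj_id)
    return result, evaluations


# Hypothetical usage on random 2-D points.
rng = random.Random(0)
objects = [(rng.random(), rng.random()) for _ in range(200)]
pivots = objects[:8]  # naive pivot choice, purely for illustration
index = build_index(objects, pivots, euclidean, k=1)
found, evals = range_search((0.5, 0.5), 0.1, objects, pivots, index, euclidean)
print(len(found), evals)
```

Raising k stores more distances per object and usually discards more candidates before the costly distance is evaluated; lowering it shrinks the index at the cost of more evaluations.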

