Analyzing Metric Space Indexes: What For?

2009 Second International Workshop on Similarity Search and Applications. Pub Date: 2009-08-29. DOI: 10.1109/SISAP.2009.17

G. Navarro


Citations: 28

Abstract

Metric space searching has come a long way since its beginnings, when people coming from algorithmics tried to apply their background to this new paradigm, with variable success that was, above all, difficult to explain. Since then, some things have been learned about the specifics of the problem, in particular regarding key aspects such as intrinsic dimensionality, which were not well understood in the beginning. Bringing those aspects into the picture has led to the most important developments in the area. Similarly, researchers have tried to apply asymptotic analysis to understand and predict the performance of the data structures. Again, it soon became clear that this was insufficient, and that the characteristics of the metric space itself could not be neglected. Although some progress has been made in understanding concepts such as the curse of dimensionality, modern researchers seem to have given up on using asymptotic analysis. They rely on experiments, or at best on detailed cost models that are good predictors but do not explain why the data structures perform the way they do. In this paper I will argue that this is a big loss. Even if the predictive capability of asymptotic analysis is poor, it is a great tool for understanding the algorithmic concepts behind the different data structures, and it gives powerful hints for the design of new ones. I will illustrate my view by reviewing what is known about the asymptotic analysis of metric indexes, and will add some new results.



