2009 Second International Workshop on Similarity Search and Applications: Latest Publications

Combinatorial Framework for Similarity Search
2009 Second International Workshop on Similarity Search and Applications. Pub Date: 2009-08-29. DOI: 10.1109/SISAP.2009.31
Y. Lifshits
Abstract: We present an overview of the combinatorial framework for similarity search. An algorithm is combinatorial if only direct comparisons between two pairwise similarity values are allowed. Namely, the input dataset is represented by a comparison oracle that, given any three points X, Y, Z, answers whether Y or Z is closer to X. We assume that the similarity order of the dataset satisfies the four variations of the following disorder inequality: if X is the A'th most similar object to Y and Y is the B'th most similar object to Z, then X is among the D(A+B) most similar objects to Z, where D is a relatively small disorder constant. Combinatorial algorithms for nearest neighbor search have two important advantages: (1) they do not map similarity values to artificial distance values and do not use the triangle inequality for the latter, and (2) they work for arbitrarily complicated data representations and similarity functions. Ranwalk, the first known combinatorial solution for nearest neighbors, is a randomized, exact, zero-error algorithm with query time that is logarithmic in the number of objects. However, Ranwalk's preprocessing time is quadratic. Later on, another solution, called combinatorial nets, was discovered. It is a deterministic and exact algorithm with almost linear time and space complexity of preprocessing, and near-logarithmic time complexity of search. Combinatorial nets also have a number of side applications. For near-duplicate detection they lead to the first known deterministic algorithm that requires just near-linear time plus time proportional to the size of the output. For any dataset with small disorder, combinatorial nets can be used to construct a visibility graph: one in which greedy routing deterministically converges to the nearest neighbor of a target in a logarithmic number of steps. The latter result is the first known work-around for Navarro's impossibility of generalizing Delaunay graphs.
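The comparison oracle and the disorder inequality described in this abstract can be illustrated with a small sketch. It is not the Ranwalk or combinatorial-nets algorithm; it is a brute-force check, under the assumption that the oracle is backed by some hidden distance function, and it tests only the one form of the inequality stated above rather than all four variations. The names `make_oracle`, `rank`, and `disorder_constant` are hypothetical.

```python
def make_oracle(dist):
    """Wrap a hidden distance function in a three-point comparison oracle.

    closer(x, y, z) answers only whether y is strictly closer to x than z is;
    callers never see the underlying distance values.
    """
    def closer(x, y, z):
        return dist(x, y) < dist(x, z)
    return closer


def rank(closer, objects, ref, obj):
    """Position of obj among the objects most similar to ref (1 = most similar).

    Uses oracle calls only: count the other objects strictly closer to ref.
    """
    return 1 + sum(
        1 for w in objects
        if w != ref and w != obj and closer(ref, w, obj)
    )


def disorder_constant(closer, objects):
    """Smallest D such that rank_z(x) <= D * (rank_y(x) + rank_z(y))
    for all distinct x, y, z. Brute force, so only for tiny datasets."""
    worst = 0.0
    for x in objects:
        for y in objects:
            for z in objects:
                if len({x, y, z}) < 3:
                    continue
                a = rank(closer, objects, y, x)  # x is the a'th most similar to y
                b = rank(closer, objects, z, y)  # y is the b'th most similar to z
                c = rank(closer, objects, z, x)  # x is the c'th most similar to z
                worst = max(worst, c / (a + b))
    return worst


if __name__ == "__main__":
    points = [0, 1, 3, 7]  # toy 1-D dataset (an assumption for illustration)
    closer = make_oracle(lambda a, b: abs(a - b))
    print(disorder_constant(closer, points))
```

The brute-force loop makes a polynomial number of oracle calls, so it is useful only for inspecting small samples, not as a search method.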
Citations: 5
Using Tuneable Fuzzy Similarity in Non-metric Search
2009 Second International Workshop on Similarity Search and Applications. Pub Date: 2009-08-29. DOI: 10.1109/SISAP.2009.18
P. Vojtás, A. Eckhardt
Abstract: We propose an alternative method for indexing data for answering queries in non-metric spaces. The traditional use of distance and the triangle inequality is replaced by fuzzy similarity that fulfils the transitivity property with a tuneable fuzzy conjunctor. Even in a non-metric space, there may exist a fuzzy conjunctor for which transitivity holds, so that the usual pivot-based indexing techniques for range queries can be applied.
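To make the idea of pivot-based pruning under fuzzy transitivity concrete, here is a minimal sketch. It assumes a symmetric similarity `sim` with values in [0, 1] that satisfies `sim(x, z) >= conj(sim(x, y), sim(y, z))` for a monotone conjunctor `conj`, and it uses the Łukasiewicz conjunctor as one fixed example rather than the tuneable family the abstract refers to. The function names are hypothetical and this is not claimed to be the authors' procedure.

```python
def lukasiewicz(a, b):
    """Łukasiewicz conjunctor (t-norm): one example choice, an assumption here."""
    return max(0.0, a + b - 1.0)


def build_index(objects, pivots, sim):
    """Precompute the similarity of every object to every pivot."""
    return {o: [sim(o, p) for p in pivots] for o in objects}


def fuzzy_range_query(q, threshold, objects, pivots, index, sim, conj=lukasiewicz):
    """Return (objects o with sim(q, o) >= threshold, number of sim(q, o) evaluations).

    Pruning rule: if sim(q, o) >= threshold, transitivity and monotonicity give
    conj(threshold, sim(o, p)) <= sim(q, p) and
    conj(threshold, sim(q, p)) <= sim(o, p) for every pivot p.
    An object violating either bound cannot be an answer and is skipped.
    """
    q_to_pivots = [sim(q, p) for p in pivots]
    results, evaluated = [], 0
    for o in objects:
        pruned = any(
            conj(threshold, s_op) > s_qp or conj(threshold, s_qp) > s_op
            for s_op, s_qp in zip(index[o], q_to_pivots)
        )
        if pruned:
            continue
        evaluated += 1
        if sim(q, o) >= threshold:
            results.append(o)
    return results, evaluated


if __name__ == "__main__":
    # Toy data: sim = 1 - |x - y| on [0, 1], which is transitive under Łukasiewicz.
    sim = lambda x, y: 1.0 - abs(x - y)
    objects = [0.0, 0.1, 0.5, 0.9, 1.0]
    pivots = [0.0]
    index = build_index(objects, pivots, sim)
    print(fuzzy_range_query(0.15, 0.9, objects, pivots, index, sim))
```

With `sim = 1 - d` for a metric `d` bounded by 1, Łukasiewicz transitivity is exactly the triangle inequality, so this reduces to ordinary pivot filtering; other conjunctors give weaker or stronger pruning bounds.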
Citations: 3
MiPai: Using the PP-Index to Build an Efficient and Scalable Similarity Search System
2009 Second International Workshop on Similarity Search and Applications. Pub Date: 2009-08-01. DOI: 10.1109/SISAP.2009.14
Andrea Esuli
Abstract: MiPai is an image search system that provides visual similarity search and text-based search functionalities. The similarity search functionality is implemented by means of the Permutation Prefix Index (PP-Index), a novel data structure for approximate similarity search. The text-based search functionality is based on a traditional inverted list index data structure. MiPai also provides a combined visual similarity/text search function.
Citations: 39
Curse of Dimensionality in Pivot Based Indexes
2009 Second International Workshop on Similarity Search and Applications. Pub Date: 2009-06-02. DOI: 10.1109/SISAP.2009.9
I. Volnyansky, V. Pestov
Abstract: We offer a theoretical validation of the curse of dimensionality in the pivot-based indexing of datasets for similarity search by proving, in the framework of statistical learning, that in high dimensions no pivot-based indexing scheme can essentially outperform the linear scan. A study of the asymptotic performance of pivot-based indexing schemes is performed on a sequence of datasets modeled as samples picked in i.i.d. fashion from a sequence of metric spaces. We allow the size of the dataset to grow in relation to dimension, such that the dimension is superlogarithmic but subpolynomial in the size of the dataset. The number of pivots is sublinear in the size of the dataset. We pick the least restrictive cost model of similarity search, in which we count each distance calculation as a single computation and disregard the rest. We demonstrate that if the intrinsic dimension of the spaces, in the sense of the concentration of measure phenomenon, is linear in dimension, then the performance of pivot-based similarity search indexes is asymptotically linear in the size of the dataset.
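For readers unfamiliar with the kind of scheme this abstract analyzes, the following is a minimal sketch of a generic pivot table with triangle-inequality filtering, together with the cost model the abstract describes (each distance calculation counts as one unit, everything else is free). The function names are hypothetical, Euclidean distance via `math.dist` is an assumption, and the sketch is not taken from the paper.

```python
import math


def build_pivot_table(objects, pivots, dist=math.dist):
    """Precompute distances from every object to every pivot (done offline)."""
    return [[dist(o, p) for p in pivots] for o in objects]


def pivot_range_query(q, radius, objects, pivots, table, dist=math.dist):
    """Return (objects within radius of q, number of query-time distance computations).

    For any pivot p, the triangle inequality gives |d(q, p) - d(o, p)| <= d(q, o),
    so an object whose lower bound exceeds the radius is discarded without
    computing d(q, o). Requires at least one pivot.
    """
    q_to_pivots = [dist(q, p) for p in pivots]
    cost = len(pivots)  # distances from the query to each pivot
    results = []
    for o, row in zip(objects, table):
        lower_bound = max(abs(dq - do) for dq, do in zip(q_to_pivots, row))
        if lower_bound > radius:
            continue  # pruned at zero cost under this cost model
        cost += 1
        if dist(q, o) <= radius:
            results.append(o)
    return results, cost


if __name__ == "__main__":
    objects = [(0, 0), (1, 0), (5, 5), (9, 9)]  # toy 2-D data, an assumption
    pivots = [(0, 0)]
    table = build_pivot_table(objects, pivots)
    print(pivot_range_query((0.5, 0), 1.0, objects, pivots, table))
```

A linear scan costs `len(objects)` distance computations in the same model. The abstract's result concerns the regime where, as dimension grows, the lower bounds stop pruning and the cost of a scheme like this approaches that of the linear scan.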
Citations: 28