A case-study of scoring schemes for the PvS-index
Herwig Lejsek
Computer Vision meets Databases, 2005-06-17
DOI: 10.1145/1160939.1160953 (https://doi.org/10.1145/1160939.1160953)
Citations: 1
Abstract
We recently proposed the PvS-index, a new indexing method for high-dimensional data. It processes queries in constant time and is well suited to similarity search in image retrieval systems that use local descriptors. The index projects data points onto random lines and uses the projected values to segment the points into appropriately sized buckets, each of which can be read in a single I/O operation. After this preprocessing step, a search reads just three buckets per query descriptor and applies a recent rank aggregation method, OMEDRANK, to produce good approximate results for the nearest neighbour problem. We have recently shown that PvS-indexing works well for large collections of real image data. In that work, however, we used a simple scoring scheme and collected only a few nearest neighbours for each query descriptor. In this study we examine how strongly the number of nearest neighbours gathered for each local descriptor influences the final query result when searching a PvS-index. Based on the results, we propose two alternative scoring schemes that improve retrieval quality and stabilise the results, making the search less sensitive to the number of nearest neighbours accumulated.
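The projection and segmentation step described in the abstract can be illustrated with a small sketch. This is not the PvS-index implementation: it covers a single random line only, and the function names (`random_unit_line`, `build_buckets`, `lookup_bucket`) and the `bucket_size` parameter are assumptions introduced for illustration. The abstract does not specify how projections are combined, how the three buckets per query are chosen, or how OMEDRANK aggregates the candidates, so none of that is modelled here.

```python
import random
from bisect import bisect_right


def random_unit_line(dim, rng):
    """Draw a random direction (unit vector) in `dim` dimensions."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = sum(x * x for x in v) ** 0.5
    return [x / norm for x in v]


def project(point, line):
    """Scalar position of `point` along `line` (dot product)."""
    return sum(p * l for p, l in zip(point, line))


def build_buckets(points, line, bucket_size):
    """Sort point ids by projected value and cut them into buckets.

    Returns (boundaries, buckets), where boundaries[k] is the smallest
    projected value stored in buckets[k + 1]. Every bucket holds at most
    `bucket_size` ids (hypothetical stand-in for "one I/O operation").
    """
    order = sorted(range(len(points)), key=lambda i: project(points[i], line))
    buckets = [order[i:i + bucket_size] for i in range(0, len(order), bucket_size)]
    boundaries = [project(points[b[0]], line) for b in buckets[1:]]
    return boundaries, buckets


def lookup_bucket(query, line, boundaries, buckets):
    """Return the single bucket whose value range contains the query's projection."""
    return buckets[bisect_right(boundaries, project(query, line))]


if __name__ == "__main__":
    rng = random.Random(0)
    data = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(100)]
    line = random_unit_line(8, rng)
    boundaries, buckets = build_buckets(data, line, bucket_size=16)
    candidates = lookup_bucket(data[0], line, boundaries, buckets)
    print(len(buckets), "buckets; candidates for point 0:", candidates)
```

Because each lookup is a binary search over bucket boundaries followed by one bucket read, the candidate set for a query descriptor has a fixed maximum size regardless of how the data is distributed along the line, which is the property the abstract relies on for bounded query cost.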