Object retrieval with large vocabularies and fast spatial matching

2007 IEEE Conference on Computer Vision and Pattern Recognition. Pub Date: 2007-06-17. DOI: 10.1109/CVPR.2007.383172

James Philbin, Ondřej Chum, M. Isard, Josef Sivic, Andrew Zisserman

Citations: 3111

Abstract

In this paper, we present a large-scale object retrieval system. The user supplies a query object by selecting a region of a query image, and the system returns a ranked list of images that contain the same object, retrieved from a large corpus. We demonstrate the scalability and performance of our system on a dataset of over 1 million images crawled from the photo-sharing site, Flickr [3], using Oxford landmarks as queries. Building an image-feature vocabulary is a major time and performance bottleneck, due to the size of our dataset. To address this problem we compare different scalable methods for building a vocabulary and introduce a novel quantization method based on randomized trees which we show outperforms the current state-of-the-art on an extensive ground-truth. Our experiments show that the quantization has a major effect on retrieval quality. To further improve query performance, we add an efficient spatial verification stage to re-rank the results returned from our bag-of-words model and show that this consistently improves search quality, though by less of a margin when the visual vocabulary is large. We view this work as a promising step towards much larger, "web-scale" image corpora.
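The bag-of-words ranking that the abstract refers to can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes tf-idf weighting with an inverted index, uses exact nearest-neighbour quantization in place of the randomized-tree method, and omits spatial verification entirely. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def nearest_word(descriptor, vocabulary):
    """Quantize a descriptor to the index of its nearest visual word.
    Exact linear search; assumed here only for simplicity."""
    return min(
        range(len(vocabulary)),
        key=lambda i: sum((d - c) ** 2 for d, c in zip(descriptor, vocabulary[i])),
    )


def build_index(corpus, vocabulary):
    """Build an inverted index of tf-idf weights (assumed weighting scheme)."""
    counts = {
        img: Counter(nearest_word(d, vocabulary) for d in descs)
        for img, descs in corpus.items()
    }
    # Document frequency: number of images in which each visual word occurs.
    df = Counter(w for c in counts.values() for w in c)
    idf = {w: math.log(len(corpus) / n) for w, n in df.items()}
    index = defaultdict(list)
    norms = {}
    for img, c in counts.items():
        # L2 norm of the image's tf-idf vector; fall back to 1.0 if it is zero.
        norms[img] = math.sqrt(sum((tf * idf[w]) ** 2 for w, tf in c.items())) or 1.0
        for w, tf in c.items():
            index[w].append((img, tf * idf[w]))
    return index, idf, norms


def rank(query_descriptors, vocabulary, index, idf, norms):
    """Rank corpus images by tf-idf similarity to the query region."""
    q = Counter(nearest_word(d, vocabulary) for d in query_descriptors)
    scores = defaultdict(float)
    for w, tf in q.items():
        # Only images that share a visual word with the query are touched.
        for img, weight in index.get(w, []):
            scores[img] += tf * idf[w] * weight / norms[img]
    return sorted(scores, key=scores.get, reverse=True)


# Toy data: 2-D "descriptors" and a three-word vocabulary (both hypothetical).
vocabulary = [(0, 0), (10, 10), (0, 10)]
corpus = {
    "a": [(0, 1), (1, 0), (9, 9)],
    "b": [(10, 9), (9, 10), (0, 9)],
    "c": [(0, 10), (1, 9), (0, 0)],
}
index, idf, norms = build_index(corpus, vocabulary)
ranked = rank([(0, 0), (1, 1), (10, 10)], vocabulary, index, idf, norms)
print(ranked[0])  # a
```

In this sketch the inverted index keeps query cost proportional to the number of images that share a visual word with the query. A re-ranking stage such as the spatial verification described in the abstract would then operate on the top of the `ranked` list.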
