CNN与SIFT图像检索:替代还是互补?

Proceedings of the 24th ACM international conference on Multimedia Pub Date : 2016-10-01 DOI:10.1145/2964284.2967252

Ke Yan, Yaowei Wang, Dawei Liang, Tiejun Huang, Yonghong Tian

{"title":"CNN与SIFT图像检索:替代还是互补?","authors":"Ke Yan, Yaowei Wang, Dawei Liang, Tiejun Huang, Yonghong Tian","doi":"10.1145/2964284.2967252","DOIUrl":null,"url":null,"abstract":"In the past decade, SIFT is widely used in most vision tasks such as image retrieval. While in recent several years, deep convolutional neural networks (CNN) features achieve the state-of-the-art performance in several tasks such as image classification and object detection. Thus a natural question arises: for the image retrieval task, can CNN features substitute for SIFT? In this paper, we experimentally demonstrate that the two kinds of features are highly complementary. Following this fact, we propose an image representation model, complementary CNN and SIFT (CCS), to fuse CNN and SIFT in a multi-level and complementary way. In particular, it can be used to simultaneously describe scene-level, object-level and point-level contents in images. Extensive experiments are conducted on four image retrieval benchmarks, and the experimental results show that our CCS achieves state-of-the-art retrieval results.","PeriodicalId":140670,"journal":{"name":"Proceedings of the 24th ACM international conference on Multimedia","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"56","resultStr":"{\"title\":\"CNN vs. SIFT for Image Retrieval: Alternative or Complementary?\",\"authors\":\"Ke Yan, Yaowei Wang, Dawei Liang, Tiejun Huang, Yonghong Tian\",\"doi\":\"10.1145/2964284.2967252\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In the past decade, SIFT is widely used in most vision tasks such as image retrieval. While in recent several years, deep convolutional neural networks (CNN) features achieve the state-of-the-art performance in several tasks such as image classification and object detection. Thus a natural question arises: for the image retrieval task, can CNN features substitute for SIFT? In this paper, we experimentally demonstrate that the two kinds of features are highly complementary. Following this fact, we propose an image representation model, complementary CNN and SIFT (CCS), to fuse CNN and SIFT in a multi-level and complementary way. In particular, it can be used to simultaneously describe scene-level, object-level and point-level contents in images. Extensive experiments are conducted on four image retrieval benchmarks, and the experimental results show that our CCS achieves state-of-the-art retrieval results.\",\"PeriodicalId\":140670,\"journal\":{\"name\":\"Proceedings of the 24th ACM international conference on Multimedia\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"56\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 24th ACM international conference on Multimedia\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2964284.2967252\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 24th ACM international conference on Multimedia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2964284.2967252","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 56

摘要

近年来，SIFT在图像检索等视觉任务中得到了广泛的应用。而近年来，深度卷积神经网络(CNN)特征在图像分类和目标检测等任务中取得了最先进的性能。因此，一个自然的问题出现了:对于图像检索任务，CNN特征可以代替SIFT吗?在本文中，我们通过实验证明了这两种特征是高度互补的。基于此，我们提出了一种图像表示模型，即互补CNN和SIFT (CCS)，将CNN和SIFT进行多层次互补融合。特别是，它可以同时描述图像中的场景级、对象级和点级内容。在四个图像检索基准上进行了大量的实验，实验结果表明我们的CCS达到了最先进的检索结果。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

CNN vs. SIFT for Image Retrieval: Alternative or Complementary?

In the past decade, SIFT is widely used in most vision tasks such as image retrieval. While in recent several years, deep convolutional neural networks (CNN) features achieve the state-of-the-art performance in several tasks such as image classification and object detection. Thus a natural question arises: for the image retrieval task, can CNN features substitute for SIFT? In this paper, we experimentally demonstrate that the two kinds of features are highly complementary. Following this fact, we propose an image representation model, complementary CNN and SIFT (CCS), to fuse CNN and SIFT in a multi-level and complementary way. In particular, it can be used to simultaneously describe scene-level, object-level and point-level contents in images. Extensive experiments are conducted on four image retrieval benchmarks, and the experimental results show that our CCS achieves state-of-the-art retrieval results.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the 24th ACM international conference on Multimedia

自引率

0.00%

发文量