Quadruplet Networks for Sketch-Based Image Retrieval

Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval. Pub Date: 2017-06-06. DOI: 10.1145/3078971.3078985

Omar Seddati, S. Dupont, S. Mahmoudi


Citations: 33

Abstract

Freehand sketches are a simple and powerful tool for communication. They are easily recognized across cultures and suitable for various applications. In this paper, we use deep convolutional neural networks (ConvNets) to address sketch-based image retrieval (SBIR). We first train our ConvNets on sketch and image object recognition using a large-scale SBIR benchmark (the Sketchy database). We then conduct a comprehensive study of ConvNet features for SBIR, using a kNN similarity search paradigm in the ConvNet feature space. In contrast to recent SBIR work, we propose a new architecture, quadruplet networks, which enhances ConvNet features for SBIR. This new architecture enables ConvNets to extract more robust global and local features. We evaluate our approach on three large-scale datasets. Our quadruplet networks outperform the previous state of the art on two of them by a significant margin and give competitive results on the third. Our system achieves a recall of 42.16% (at k=1) on the Sketchy database (more than 5% improvement), a Kendall score of 43.28 τ_b on the TU-Berlin SBIR benchmark (close to 6 τ_b improvement), and a mean average precision (MAP) of 32.16% on Flickr15k (a category-level SBIR benchmark).
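The retrieval and evaluation setup described in the abstract (kNN search in a feature space, scored by recall at k) can be illustrated with a minimal sketch. It is not the authors' implementation: the Euclidean distance, the toy 2-D vectors, and the names `knn_search`, `recall_at_k`, `image_features`, `sketch_features`, and `matching_image` are all assumptions made for illustration. In the paper the features would come from the trained ConvNets.

```python
import math
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_search(query: Sequence[float],
               gallery: List[Sequence[float]],
               k: int) -> List[int]:
    """Return the indices of the k gallery vectors closest to the query."""
    order = sorted(range(len(gallery)),
                   key=lambda i: euclidean(query, gallery[i]))
    return order[:k]


def recall_at_k(queries: List[Sequence[float]],
                gallery: List[Sequence[float]],
                targets: List[int],
                k: int) -> float:
    """Fraction of queries whose target index appears among the top k."""
    hits = sum(1 for q, t in zip(queries, targets)
               if t in knn_search(q, gallery, k))
    return hits / len(queries)


# Toy stand-ins for ConvNet features (hypothetical values, assumption).
image_features = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
sketch_features = [[0.1, -0.1], [0.9, 1.2], [2.9, 2.9]]
# matching_image[i] is the gallery index of the photo drawn in sketch i.
matching_image = [0, 1, 2]

print(recall_at_k(sketch_features, image_features, matching_image, k=1))
print(recall_at_k(sketch_features, image_features, matching_image, k=2))
```

In this toy example the third sketch lies slightly closer to the wrong image, so it is missed at k=1 and recovered at k=2. This is the kind of effect a recall-at-k curve is meant to expose.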
