Towards Human Performance on Sketch-Based Image Retrieval

Proceedings of the 19th International Conference on Content-based Multimedia Indexing. Pub Date: 2022-09-14. DOI: 10.1145/3549555.3549582

Omar Seddati, S. Dupont, S. Mahmoudi, T. Dutoit


Citation count: 1

Abstract

Sketch-based image retrieval (SBIR) solutions are attracting increasing interest in the field of computer vision. These solutions provide an intuitive and powerful tool for retrieving images from large-scale image databases. In this paper, we conduct a comprehensive study of classic triplet CNN training pipelines within the SBIR context. We study the impact of embedding normalization, model sharing, margin selection, batch size, hard mining selection, and the evolution of the number of hard triplets during training, and propose several avenues for improvement. We also propose dropout column, an adaptation of dropout for triplet networks and similar pipelines. In addition, we introduce a novel approach to building state-of-the-art SBIR solutions that can be used with low-power systems. The whole study is conducted using The Sketchy Database, a large-scale SBIR database. We carry out a series of experiments and show that adopting a few simple modifications significantly enhances existing SBIR pipelines (faster training and higher accuracy). Our study enables us to propose an enhanced pipeline that outperforms the previous state of the art on The Sketchy Database by a significant margin (a recall of 53.92% compared to 46.2% at k = 1) and almost reaches human performance (54.27%) on a large-scale benchmark.
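The abstract names the ingredients of a triplet pipeline (embedding normalization, a margin, hard triplets, recall at k) without giving formulas. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the common hinge form of the triplet loss, max(0, d(a, p) - d(a, n) + margin), with Euclidean distance, and it assumes that sketch i matches photo i when computing recall. The function names and the margin value 0.2 are hypothetical choices made only for this example.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit Euclidean length (all-zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2, normalize=True):
    """Assumed hinge triplet loss: max(0, d(a, p) - d(a, n) + margin)."""
    if normalize:
        anchor, positive, negative = map(l2_normalize, (anchor, positive, negative))
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def count_hard_triplets(triplets, margin=0.2):
    """Count triplets that still violate the margin, i.e. have non-zero loss."""
    return sum(1 for a, p, n in triplets if triplet_loss(a, p, n, margin) > 0.0)


def recall_at_k(sketch_embs, photo_embs, k=1):
    """Fraction of sketches whose matching photo (same index, by assumption)
    appears among the k nearest photos."""
    hits = 0
    for i, sketch in enumerate(sketch_embs):
        ranked = sorted(range(len(photo_embs)),
                        key=lambda j: euclidean(sketch, photo_embs[j]))
        hits += i in ranked[:k]
    return hits / len(sketch_embs)
```

Tracking `count_hard_triplets` over training batches is one simple way to observe the "evolution of the number of hard triplets" that the abstract refers to; how the authors measure it is not stated here.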
