Cross-modal subspace learning for sketch-based image retrieval: A comparative study
Peng Xu, Ke Li, Zhanyu Ma, Yi-Zhe Song, Liang Wang, Jun Guo
2016 IEEE International Conference on Network Infrastructure and Digital Content (IC-NIDC), September 2016
DOI: 10.1109/ICNIDC.2016.7974625
Citations: 23
Abstract
Sketch-based image retrieval (SBIR) has become a prominent research topic in recent years due to the proliferation of touch screens. The problem is, however, very challenging because photos and sketches are inherently modeled in different modalities. Photos are accurate (colored and textured) depictions of the real world, whereas sketches are highly abstract (black-and-white) renderings often drawn from human memory. This naturally motivates us to study the effectiveness of various cross-modal retrieval methods for SBIR. However, to the best of our knowledge, all established cross-modal algorithms are designed to bridge the more conventional cross-modal gap between image and text, making their general applicability to SBIR unclear. In this paper, we design a series of experiments to clearly illustrate the circumstances under which cross-modal methods can be best utilized to solve the SBIR problem. More specifically, we choose six state-of-the-art cross-modal subspace learning approaches that were shown to work well on image-text retrieval and conduct extensive experiments on a recently released SBIR dataset. Finally, we present a detailed comparative analysis of the experimental results and offer insights to benefit future research.
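The abstract does not name the six subspace learning approaches here, but canonical correlation analysis (CCA) is a typical member of this family, so the sketch below is a minimal, hypothetical illustration of how such a method could be applied to SBIR: paired sketch/photo features are projected into a learned common subspace, and gallery photos are then ranked by similarity to a projected query sketch. The feature matrices, dimensions, and the cosine-ranking helper are stand-in assumptions, not details from the paper; a real experiment would use descriptors extracted from an SBIR benchmark.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

# Hypothetical paired features: each row i is a sketch and its corresponding photo.
# Real experiments would use features extracted from an SBIR dataset.
rng = np.random.default_rng(0)
sketch_feats = rng.normal(size=(200, 128))   # 200 sketches, 128-D descriptors (assumed)
photo_feats = rng.normal(size=(200, 256))    # 200 paired photos, 256-D descriptors (assumed)

# Learn a shared subspace from the paired training data.
cca = CCA(n_components=32)
cca.fit(sketch_feats, photo_feats)

# Project both modalities into the common subspace.
sketch_proj, photo_proj = cca.transform(sketch_feats, photo_feats)

def cosine_rank(query_vec, gallery):
    """Rank gallery rows by cosine similarity to the query, best first."""
    q = query_vec / np.linalg.norm(query_vec)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    return np.argsort(-g @ q)

# Retrieval for one query sketch: rank all photos in the shared subspace.
ranking = cosine_rank(sketch_proj[0], photo_proj)
print("Top-5 retrieved photo indices for sketch 0:", ranking[:5])
```

Other subspace methods compared in work of this kind differ mainly in how the projection is learned (e.g., supervised or kernelized variants), but the retrieval step, ranking by distance in the shared space, follows the same pattern as above.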