Coupled dictionary learning and feature mapping for cross-modal retrieval

Xing Xu, Atsushi Shimada, R. Taniguchi, Li He
{"title":"Coupled dictionary learning and feature mapping for cross-modal retrieval","authors":"Xing Xu, Atsushi Shimada, R. Taniguchi, Li He","doi":"10.1109/ICME.2015.7177396","DOIUrl":null,"url":null,"abstract":"In this paper, we investigate the problem of modeling images and associated text for cross-modal retrieval tasks such as text-to-image search and image-to-text search. To make the data from image and text modalities comparable, previous cross-modal retrieval methods directly learn two projection matrices to map the raw features of the two modalities into a common subspace, in which cross-modal data matching can be performed. However, the different feature representations and correlation structures of different modalities inhibit these methods from efficiently modeling the relationships across modalities through a common subspace. To handle the diversities of different modalities, we first leverage the coupled dictionary learning method to generate homogeneous sparse representations for different modalities by associating and jointly updating their dictionaries. We then use a coupled feature mapping scheme to project the derived sparse representations from different modalities into a common subspace in which cross-modal retrieval can be performed. Experiments on a variety of cross-modal retrieval tasks demonstrate that the proposed method outperforms the state-of-the-art approaches.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"32","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 IEEE International Conference on Multimedia and Expo (ICME)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICME.2015.7177396","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 32

Abstract

In this paper, we investigate the problem of modeling images and their associated text for cross-modal retrieval tasks such as text-to-image and image-to-text search. To make data from the image and text modalities comparable, previous cross-modal retrieval methods directly learn two projection matrices that map the raw features of the two modalities into a common subspace, in which cross-modal matching can be performed. However, the heterogeneous feature representations and correlation structures of the two modalities prevent these methods from efficiently modeling cross-modal relationships through a common subspace. To handle this heterogeneity, we first leverage coupled dictionary learning to generate homogeneous sparse representations for the two modalities by associating and jointly updating their dictionaries. We then use a coupled feature mapping scheme to project the resulting sparse representations from the two modalities into a common subspace in which cross-modal retrieval can be performed. Experiments on a variety of cross-modal retrieval tasks demonstrate that the proposed method outperforms state-of-the-art approaches.
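The abstract names the two stages but not their objectives or update rules. The sketch below is a minimal NumPy rendition under assumed forms: an l1-regularized reconstruction objective per modality with a quadratic coupling term ||A_x - A_y||_F^2 tying the paired sparse codes together (optimized by alternating ISTA steps and ridge least-squares dictionary updates), and a feature-mapping stage realized as ridge regression of each modality's codes onto shared targets such as one-hot labels. All function names, update rules, and hyperparameters here are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def soft_threshold(Z, lam):
    # Proximal operator of the l1 norm, applied elementwise.
    return np.sign(Z) * np.maximum(np.abs(Z) - lam, 0.0)

def coupled_dictionary_learning(X, Y, k=64, lam=0.1, gamma=1.0, n_iter=30, seed=0):
    # X: (d_x, n) image features; Y: (d_y, n) text features, columns paired.
    # Assumed objective per modality: 0.5||F - D A||^2 + lam||A||_1, plus a
    # coupling term 0.5*gamma*||A_x - A_y||^2 that pushes paired samples
    # toward a shared sparse representation.
    rng = np.random.default_rng(seed)
    Dx = rng.standard_normal((X.shape[0], k))
    Dy = rng.standard_normal((Y.shape[0], k))
    Dx /= np.linalg.norm(Dx, axis=0)
    Dy /= np.linalg.norm(Dy, axis=0)
    n = X.shape[1]
    Ax = np.zeros((k, n))
    Ay = np.zeros((k, n))
    for _ in range(n_iter):
        for A, D, F, other in ((Ax, Dx, X, Ay), (Ay, Dy, Y, Ax)):
            # One ISTA step; step size 1/L with L the Lipschitz constant
            # of the smooth part of the objective.
            step = 1.0 / (np.linalg.norm(D, 2) ** 2 + gamma)
            grad = D.T @ (D @ A - F) + gamma * (A - other)
            A[:] = soft_threshold(A - step * grad, step * lam)
        for D, F, A in ((Dx, X, Ax), (Dy, Y, Ay)):
            # Ridge-regularized least-squares dictionary update, then
            # renormalize atoms to unit norm.
            D[:] = F @ A.T @ np.linalg.inv(A @ A.T + 1e-6 * np.eye(k))
            D /= np.maximum(np.linalg.norm(D, axis=0, keepdims=True), 1e-12)
    return Dx, Dy, Ax, Ay

def coupled_feature_mapping(Ax, Ay, S, reg=1e-3):
    # Map each modality's sparse codes into a common subspace by ridge
    # regression onto shared targets S (e.g. one-hot labels) -- one plausible
    # instantiation of the coupled feature mapping stage, not the paper's.
    k = Ax.shape[0]
    Px = S @ Ax.T @ np.linalg.inv(Ax @ Ax.T + reg * np.eye(k))
    Py = S @ Ay.T @ np.linalg.inv(Ay @ Ay.T + reg * np.eye(k))
    return Px, Py

# Toy usage on random paired data with 5 hypothetical classes.
rng = np.random.default_rng(1)
X = rng.standard_normal((128, 200))         # image features
Y = rng.standard_normal((300, 200))         # text features
S = np.eye(5)[:, rng.integers(0, 5, 200)]   # shared one-hot targets, (5, 200)
Dx, Dy, Ax, Ay = coupled_dictionary_learning(X, Y)
Px, Py = coupled_feature_mapping(Ax, Ay, S)
# At query time: sparse-code a new image over Dx, project with Px, and rank
# text items by similarity to their Py-projected codes (and vice versa).
```

Under these assumptions, retrieval reduces to nearest-neighbor search in the common subspace between Px-projected image codes and Py-projected text codes; the choice of shared targets S is what couples the two mappings.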