Joint-modal Distribution-based Similarity Hashing for Large-scale Unsupervised Deep Cross-modal Retrieval

Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval Pub Date : 2020-07-25 DOI:10.1145/3397271.3401086

Song Liu, Shengsheng Qian, Yang Guan, Jiawei Zhan, Long Ying

{"title":"Joint-modal Distribution-based Similarity Hashing for Large-scale Unsupervised Deep Cross-modal Retrieval","authors":"Song Liu, Shengsheng Qian, Yang Guan, Jiawei Zhan, Long Ying","doi":"10.1145/3397271.3401086","DOIUrl":null,"url":null,"abstract":"Hashing-based cross-modal search which aims to map multiple modality features into binary codes has attracted increasingly attention due to its storage and search efficiency especially in large-scale database retrieval. Recent unsupervised deep cross-modal hashing methods have shown promising results. However, existing approaches typically suffer from two limitations: (1) They usually learn cross-modal similarity information separately or in a redundant fusion manner, which may fail to capture semantic correlations among instances from different modalities sufficiently and effectively. (2) They seldom consider the sampling and weighting schemes for unsupervised cross-modal hashing, resulting in the lack of satisfactory discriminative ability in hash codes. To overcome these limitations, we propose a novel unsupervised deep cross-modal hashing method called Joint-modal Distribution-based Similarity Hashing (JDSH) for large-scale cross-modal retrieval. Firstly, we propose a novel cross-modal joint-training method by constructing a joint-modal similarity matrix to fully preserve the cross-modal semantic correlations among instances. Secondly, we propose a sampling and weighting scheme termed the Distribution-based Similarity Decision and Weighting (DSDW) method for unsupervised cross-modal hashing, which is able to generate more discriminative hash codes by pushing semantic similar instance pairs closer and pulling semantic dissimilar instance pairs apart. The experimental results demonstrate the superiority of JDSH compared with several unsupervised cross-modal hashing methods on two public datasets NUS-WIDE and MIRFlickr.","PeriodicalId":252050,"journal":{"name":"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"15 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"70","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3397271.3401086","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 70

Abstract

Hashing-based cross-modal search which aims to map multiple modality features into binary codes has attracted increasingly attention due to its storage and search efficiency especially in large-scale database retrieval. Recent unsupervised deep cross-modal hashing methods have shown promising results. However, existing approaches typically suffer from two limitations: (1) They usually learn cross-modal similarity information separately or in a redundant fusion manner, which may fail to capture semantic correlations among instances from different modalities sufficiently and effectively. (2) They seldom consider the sampling and weighting schemes for unsupervised cross-modal hashing, resulting in the lack of satisfactory discriminative ability in hash codes. To overcome these limitations, we propose a novel unsupervised deep cross-modal hashing method called Joint-modal Distribution-based Similarity Hashing (JDSH) for large-scale cross-modal retrieval. Firstly, we propose a novel cross-modal joint-training method by constructing a joint-modal similarity matrix to fully preserve the cross-modal semantic correlations among instances. Secondly, we propose a sampling and weighting scheme termed the Distribution-based Similarity Decision and Weighting (DSDW) method for unsupervised cross-modal hashing, which is able to generate more discriminative hash codes by pushing semantic similar instance pairs closer and pulling semantic dissimilar instance pairs apart. The experimental results demonstrate the superiority of JDSH compared with several unsupervised cross-modal hashing methods on two public datasets NUS-WIDE and MIRFlickr.

查看原文本刊更多论文

基于联合模态分布的大规模无监督深度跨模态检索相似性哈希

基于哈希的跨模态搜索以多模态特征映射到二进制码中为目标，其存储和搜索效率越来越受到人们的关注，特别是在大规模数据库检索中。最近的无监督深度跨模态哈希方法已经显示出有希望的结果。然而，现有的方法通常存在两个局限性:(1)它们通常单独或以冗余融合的方式学习跨模态相似性信息，可能无法充分有效地捕获不同模态实例之间的语义相关性。(2)对于无监督跨模态哈希，他们很少考虑采样和加权方案，导致哈希码缺乏令人满意的判别能力。为了克服这些限制，我们提出了一种新的无监督深度跨模态哈希方法，称为基于联合模态分布的相似性哈希(JDSH)，用于大规模跨模态检索。首先，我们提出了一种新的跨模态联合训练方法，通过构造一个联合模态相似矩阵来充分保持实例间的跨模态语义相关性。其次，针对无监督跨模态哈希，提出了一种基于分布的相似性决策和加权(DSDW)方法，该方法通过将语义相似的实例对推得更近，将语义不相似的实例对拉得更远，从而产生更多的判别哈希码。实验结果表明，在NUS-WIDE和MIRFlickr两个公共数据集上，JDSH比几种无监督跨模态哈希方法更具有优越性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval

自引率

0.00%

发文量