Cross-Modal Deep Variational Hashing

2017 IEEE International Conference on Computer Vision (ICCV). Pub Date: 2017-10-01. DOI: 10.1109/ICCV.2017.439

Venice Erin Liong, Jiwen Lu, Yap-Peng Tan, Jie Zhou

Citations: 78

Abstract

In this paper, we propose a cross-modal deep variational hashing (CMDVH) method for cross-modality multimedia retrieval. Unlike existing cross-modal hashing methods, which learn a single pair of projections to map each example to a binary vector, we design a pair of deep neural networks to learn non-linear transformations from image-text input pairs, so that unified binary codes can be obtained. We then design the modality-specific neural networks in a probabilistic manner: each models a latent variable that is kept as close as possible to the inferred binary codes, and this latent variable is approximated by a posterior distribution regularized by a known prior. Experimental results on three benchmark datasets show the efficacy of the proposed approach.
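
The abstract does not state the training objective, so the following is only a generic sketch of what "a posterior distribution regularized by a known prior" could look like for one modality-specific network. It is not the paper's formulation. It assumes a diagonal Gaussian posterior, a standard normal prior, binary codes in {-1, +1}, a squared-error term that pulls the sampled latent variable toward the inferred code, and a weight `beta` on the KL term. All function and parameter names are hypothetical.

```python
import math
import random

def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ) for one example."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar))

def sample_latent(mu, logvar, rng):
    """Reparameterized sample z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]

def example_loss(mu, logvar, code, rng, beta=1.0):
    """Assumed objective: squared distance of the sampled latent to the
    binary code, plus a beta-weighted KL penalty toward the prior."""
    z = sample_latent(mu, logvar, rng)
    fit = sum((zi - bi) ** 2 for zi, bi in zip(z, code))
    return fit + beta * kl_to_standard_normal(mu, logvar)

def to_binary(mu):
    """Threshold the posterior mean at zero to get a code in {-1, +1}."""
    return [1 if m >= 0 else -1 for m in mu]

# Toy values standing in for one network's output on a single input.
rng = random.Random(0)
mu, logvar = [0.8, -1.1, 0.3], [-2.0, -2.0, -2.0]
code = [1, -1, 1]  # inferred unified binary code (assumed given)
loss = example_loss(mu, logvar, code, rng)
```

In this sketch the fit term ties the latent variable to the shared code, while the KL term keeps the posterior from drifting far from the prior. At retrieval time a query from either modality would be hashed with `to_binary` on its own network's posterior mean.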

