Proxy-Based Graph Convolutional Hashing for Cross-Modal Retrieval

IF 7.5 · CAS Tier 3, Computer Science · JCR Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS
Yibing Bai;Zhenqiu Shu;Jun Yu;Zhengtao Yu;Xiao-Jun Wu
{"title":"基于代理的跨模态检索图卷积哈希算法","authors":"Yibing Bai;Zhenqiu Shu;Jun Yu;Zhengtao Yu;Xiao-Jun Wu","doi":"10.1109/TBDATA.2023.3338951","DOIUrl":null,"url":null,"abstract":"Cross-modal hashing retrieval approaches have received extensive attention owing to their storage superiority and retrieval efficiency. To achieve better retrieval performances, hashing methods seek to embed more semantic information of multi-modal data into hash codes. Existing deep cross-modal hashing methods typically learn hash functions from the similarity of paired data to generate hash codes. However, such locally-oriented learning methods often suffer from low efficiency and incomplete acquisition of semantic information. To address these challenges, this paper presents a novel deep hashing approach, called Proxy-based Graph Convolutional Hashing (PGCH), for cross-modal retrieval. Specifically, we use global similarity to construct proxy hash codes for two different modalities. This strategy of these proxy hash codes ensures that they include data points with significant distribution differences. It helps to match data from different modalities to different proxy hash codes, which can capture the global similarity of multi-modal hash codes and improve the efficiency of hash code learning. Subsequently, we employ a multi-modal contrastive loss to learn the global similarity. Furthermore, by constructing a proxy hash matrix from the proxy hash codes, we apply graph convolution to efficiently narrow the gap between different modalities, leading to a substantial improvement in retrieval performance for cross-modal retrieval tasks. The comprehensive experiments on four benchmark multimedia datasets demonstrate that our PGCH approach achieves better retrieval performances than a bundle of state-of-the-art hashing approaches.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"10 4","pages":"371-385"},"PeriodicalIF":7.5000,"publicationDate":"2023-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Proxy-Based Graph Convolutional Hashing for Cross-Modal Retrieval\",\"authors\":\"Yibing Bai;Zhenqiu Shu;Jun Yu;Zhengtao Yu;Xiao-Jun Wu\",\"doi\":\"10.1109/TBDATA.2023.3338951\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Cross-modal hashing retrieval approaches have received extensive attention owing to their storage superiority and retrieval efficiency. To achieve better retrieval performances, hashing methods seek to embed more semantic information of multi-modal data into hash codes. Existing deep cross-modal hashing methods typically learn hash functions from the similarity of paired data to generate hash codes. However, such locally-oriented learning methods often suffer from low efficiency and incomplete acquisition of semantic information. To address these challenges, this paper presents a novel deep hashing approach, called Proxy-based Graph Convolutional Hashing (PGCH), for cross-modal retrieval. Specifically, we use global similarity to construct proxy hash codes for two different modalities. This strategy of these proxy hash codes ensures that they include data points with significant distribution differences. It helps to match data from different modalities to different proxy hash codes, which can capture the global similarity of multi-modal hash codes and improve the efficiency of hash code learning. Subsequently, we employ a multi-modal contrastive loss to learn the global similarity. 
Furthermore, by constructing a proxy hash matrix from the proxy hash codes, we apply graph convolution to efficiently narrow the gap between different modalities, leading to a substantial improvement in retrieval performance for cross-modal retrieval tasks. The comprehensive experiments on four benchmark multimedia datasets demonstrate that our PGCH approach achieves better retrieval performances than a bundle of state-of-the-art hashing approaches.\",\"PeriodicalId\":13106,\"journal\":{\"name\":\"IEEE Transactions on Big Data\",\"volume\":\"10 4\",\"pages\":\"371-385\"},\"PeriodicalIF\":7.5000,\"publicationDate\":\"2023-12-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Big Data\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10339853/\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Big Data","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10339853/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Citations: 0

Abstract

Cross-modal hashing retrieval approaches have received extensive attention owing to their storage efficiency and retrieval speed. To achieve better retrieval performance, hashing methods seek to embed more of the semantic information in multi-modal data into hash codes. Existing deep cross-modal hashing methods typically learn hash functions from the similarity of paired data. However, such locally oriented learning is often inefficient and captures semantic information only incompletely. To address these challenges, this paper presents a novel deep hashing approach for cross-modal retrieval, called Proxy-based Graph Convolutional Hashing (PGCH). Specifically, we use global similarity to construct proxy hash codes for the two modalities; this construction ensures that the proxies cover data points with significant distribution differences. Matching data from different modalities to different proxy hash codes captures the global similarity of multi-modal hash codes and improves the efficiency of hash-code learning. We then employ a multi-modal contrastive loss to learn this global similarity. Furthermore, by building a proxy hash matrix from the proxy hash codes, we apply graph convolution to efficiently narrow the gap between modalities, substantially improving retrieval performance on cross-modal retrieval tasks. Comprehensive experiments on four benchmark multimedia datasets demonstrate that PGCH achieves better retrieval performance than a range of state-of-the-art hashing approaches.
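
The abstract names three ingredients: modality-specific proxy hash codes built from global similarity, a multi-modal contrastive loss against those proxies, and a graph convolution over a proxy hash matrix to narrow the modality gap. The paper's own implementation is not reproduced here; the sketch below is only a minimal PyTorch-style illustration of how such pieces could fit together. Every name in it (proxy_contrastive_loss, proxy_graph_conv, the temperature tau, the bit-agreement adjacency) is an illustrative assumption, not the authors' code, and the exact losses in PGCH may differ.

# Minimal sketch (not the authors' code): a proxy-based contrastive loss and a
# graph-convolution step over a proxy hash matrix, loosely following the abstract.
import torch
import torch.nn.functional as F

def proxy_contrastive_loss(feats, proxies, labels, tau=0.1):
    # InfoNCE-style loss: pull each feature toward its class proxy and push it
    # away from the other proxies. The temperature tau and this exact form are
    # assumptions; PGCH's multi-modal contrastive loss may be defined differently.
    feats = F.normalize(feats, dim=1)        # (batch, d)
    proxies = F.normalize(proxies, dim=1)    # (num_proxies, d)
    logits = feats @ proxies.t() / tau       # scaled cosine similarities
    return F.cross_entropy(logits, labels)

def proxy_graph_conv(proxy_matrix, feats):
    # One graph-convolution step whose adjacency is derived from binary proxy
    # codes (assumed construction: fraction of agreeing bits, in [0, 1]).
    codes = torch.sign(proxy_matrix)
    n_bits = codes.shape[1]
    adj = (codes @ codes.t() + n_bits) / (2 * n_bits)
    deg = adj.sum(dim=1, keepdim=True)       # row-normalize the aggregation
    return (adj / deg) @ feats

# Toy usage: 8 image and 8 text features, 4 proxies, 64-bit codes.
torch.manual_seed(0)
img, txt = torch.randn(8, 64), torch.randn(8, 64)
proxies = torch.randn(4, 64)
labels = torch.randint(0, 4, (8,))           # class of each paired sample
loss = (proxy_contrastive_loss(img, proxies, labels)
        + proxy_contrastive_loss(txt, proxies, labels))
smoothed = proxy_graph_conv(proxies, proxies)
print(loss.item(), smoothed.shape)           # scalar loss, torch.Size([4, 64])

In this toy run the contrastive term pulls each modality's features toward the proxy of its class, and the graph-convolution step mixes proxies whose binary codes agree, which is one plausible reading of how a proxy hash matrix could be used to narrow the gap between modalities.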
Source Journal: IEEE Transactions on Big Data
CiteScore: 11.80
Self-citation rate: 2.80%
Articles per year: 114
Journal description: The IEEE Transactions on Big Data publishes peer-reviewed articles focusing on big data. These articles present innovative research ideas and application results across disciplines, including novel theories, algorithms, and applications. Research areas cover a wide range, such as big data analytics, visualization, curation, management, semantics, infrastructure, standards, performance analysis, intelligence extraction, scientific discovery, security, privacy, and legal issues specific to big data. The journal also prioritizes applications of big data in fields generating massive datasets.