A novel deep high-level concept-mining jointing hashing model for unsupervised cross-modal retrieval

Impact Factor: 3.0 · JCR: Q2 · Category: COMPUTER SCIENCE, INFORMATION SYSTEMS

High-Confidence Computing · Pub Date: 2024-10-29 · DOI: 10.1016/j.hcc.2024.100274

Chun-Ru Dong, Jun-Yan Zhang, Feng Zhang, Qiang Hua, Dachuan Xu

Volume 5, Issue 2, Article 100274 · Full text: https://www.sciencedirect.com/science/article/pii/S2667295224000771

Citations: 0


A novel deep high-level concept-mining jointing hashing model for unsupervised cross-modal retrieval

Unsupervised cross-modal hashing has achieved great success in various information retrieval applications owing to its efficient storage and fast retrieval. Recent studies have primarily focused on training hash-encoding networks by calculating a sample-based similarity matrix to improve retrieval performance. However, two issues remain to be solved: (1) the current sample-based similarity matrix only considers the similarity between image-text pairs and ignores the different information densities of each modality, which may introduce additional noise and fail to mine the key information for retrieval; (2) most existing unsupervised cross-modal hashing methods only consider alignment between different modalities while ignoring consistency within each modality, resulting in semantic conflicts. To tackle these challenges, this study proposes a novel Deep High-level Concept-mining Jointing Hashing (DHCJH) model for unsupervised cross-modal retrieval. DHCJH captures the essential high-level semantic information from the image modality and integrates it into the text modality to improve the accuracy of the guidance information. Additionally, a new hashing loss with a regularization term is introduced to avoid cross-modal semantic collisions and false-positive pairs. To validate the proposed method, extensive comparison experiments are conducted on benchmark datasets. The experimental findings show that DHCJH achieves superior performance in both accuracy and efficiency. The code of DHCJH is available on GitHub.
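For readers unfamiliar with the setting, the following is a minimal, generic sketch of two building blocks the abstract refers to: a sample-based similarity matrix computed from features, and retrieval by Hamming distance over binary hash codes. It is not the DHCJH model or its loss. The function names, the choice of cosine similarity, and sign-based binarization are illustrative assumptions, not details taken from the paper.

```python
import math


def cosine_similarity_matrix(features):
    """Return the n x n matrix of pairwise cosine similarities.

    `features` is a list of equal-length real-valued vectors, one per sample.
    (Assumed similarity measure; the paper may use a different one.)
    """
    norms = [math.sqrt(sum(x * x for x in row)) for row in features]
    n = len(features)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(features[i], features[j]))
            denom = norms[i] * norms[j]
            sim[i][j] = dot / denom if denom > 0 else 0.0
    return sim


def binarize(vector):
    """Map a real-valued vector to a {-1, +1} hash code by sign."""
    return [1 if v >= 0 else -1 for v in vector]


def hamming_distance(code_a, code_b):
    """Count the positions where two equal-length codes differ."""
    return sum(1 for a, b in zip(code_a, code_b) if a != b)


def rank_by_hamming(query_code, database_codes):
    """Return database indices sorted by ascending Hamming distance."""
    return sorted(
        range(len(database_codes)),
        key=lambda idx: hamming_distance(query_code, database_codes[idx]),
    )


if __name__ == "__main__":
    # Toy features standing in for network outputs (hypothetical values).
    image_features = [[0.9, 0.1, -0.3], [-0.8, 0.2, 0.5], [0.7, -0.2, -0.4]]
    similarity = cosine_similarity_matrix(image_features)
    database_codes = [binarize(f) for f in image_features]
    query_code = binarize([0.6, 0.3, -0.1])  # e.g. a text query's features
    print(rank_by_hamming(query_code, database_codes))
```

In an actual unsupervised cross-modal hashing pipeline, a similarity matrix of this kind typically serves as the training signal for the hashing networks, while the binary codes are what make storage compact and lookup fast.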


Source journal

High-Confidence Computing

CiteScore

4.70

Self-citation rate

0.00%

Number of publications