Semantic Structure Enhanced Contrastive Adversarial Hash Network for Cross-media Representation Learning

M. Liang, Junping Du, Xiaowen Cao, Yang Yu, Kangkang Lu, Zhe Xue, Min Zhang
{"title":"Semantic Structure Enhanced Contrastive Adversarial Hash Network for Cross-media Representation Learning","authors":"M. Liang, Junping Du, Xiaowen Cao, Yang Yu, Kangkang Lu, Zhe Xue, Min Zhang","doi":"10.1145/3503161.3548391","DOIUrl":null,"url":null,"abstract":"Deep cross-media hashing technology provides an efficient cross-media representation learning solution for cross-media search. However, the existing methods do not consider both fine-grained semantic features and semantic structures to mine implicit cross-media semantic associations, which leads to weaker semantic discrimination and consistency for cross-media representation. To tackle this problem, we propose a novel semantic structure enhanced contrastive adversarial hash network for cross-media representation learning (SCAHN). Firstly, in order to capture more fine-grained cross-media semantic associations, a fine-grained cross-media attention feature learning network is constructed, thus the learned saliency features of different modalities are more conducive to cross-media semantic alignment and fusion. Secondly, for further improving learning ability of implicit cross-media semantic associations, a semantic label association graph is constructed, and the graph convolutional network is utilized to mine the implicit semantic structures, thus guiding learning of discriminative features of different modalities. Thirdly, a cross-media and intra-media contrastive adversarial representation learning mechanism is proposed to further enhance the semantic discriminativeness of different modal representations, and a dual-way adversarial learning strategy is developed to maximize cross-media semantic associations, so as to obtain cross-media unified representations with stronger discriminativeness and semantic consistency preserving power. Extensive experiments on several cross-media benchmark datasets demonstrate that the proposed SCAHN outperforms the state-of-the-art methods.","PeriodicalId":412792,"journal":{"name":"Proceedings of the 30th ACM International Conference on Multimedia","volume":"42 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 30th ACM International Conference on Multimedia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3503161.3548391","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

Abstract

Deep cross-media hashing provides an efficient cross-media representation learning solution for cross-media search. However, existing methods do not jointly consider fine-grained semantic features and semantic structures when mining implicit cross-media semantic associations, which weakens the semantic discrimination and consistency of cross-media representations. To tackle this problem, we propose a novel semantic structure enhanced contrastive adversarial hash network for cross-media representation learning (SCAHN). Firstly, to capture more fine-grained cross-media semantic associations, a fine-grained cross-media attention feature learning network is constructed, so that the learned saliency features of different modalities are more conducive to cross-media semantic alignment and fusion. Secondly, to further improve the learning of implicit cross-media semantic associations, a semantic label association graph is constructed, and a graph convolutional network is used to mine implicit semantic structures, thereby guiding the learning of discriminative features for each modality. Thirdly, a cross-media and intra-media contrastive adversarial representation learning mechanism is proposed to further enhance the semantic discriminativeness of the modality-specific representations, and a dual-way adversarial learning strategy is developed to maximize cross-media semantic associations, yielding unified cross-media representations with stronger discriminativeness and better semantic-consistency preservation. Extensive experiments on several cross-media benchmark datasets demonstrate that the proposed SCAHN outperforms state-of-the-art methods.
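The abstract's second component, mining implicit semantic structure from a label association graph with a graph convolutional network, can be illustrated with a short sketch. The following PyTorch code is not the authors' implementation; the class name, dimensions, and the co-occurrence-style adjacency with symmetric normalization are illustrative assumptions about one common way such a label GCN is built.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LabelGCN(nn.Module):
    """Two-layer GCN over a semantic label association graph.

    The adjacency matrix (e.g., label co-occurrence statistics) is an
    assumption here; the paper's exact graph construction may differ.
    """
    def __init__(self, num_labels: int, emb_dim: int, out_dim: int):
        super().__init__()
        self.label_emb = nn.Parameter(torch.randn(num_labels, emb_dim))
        self.fc1 = nn.Linear(emb_dim, out_dim)
        self.fc2 = nn.Linear(out_dim, out_dim)

    @staticmethod
    def normalize(adj: torch.Tensor) -> torch.Tensor:
        # Symmetric normalization: D^{-1/2} (A + I) D^{-1/2}
        adj = adj + torch.eye(adj.size(0), device=adj.device)
        d_inv_sqrt = adj.sum(dim=1).pow(-0.5)
        return d_inv_sqrt.unsqueeze(1) * adj * d_inv_sqrt.unsqueeze(0)

    def forward(self, adj: torch.Tensor) -> torch.Tensor:
        a = self.normalize(adj)
        h = F.relu(a @ self.fc1(self.label_emb))
        return a @ self.fc2(h)  # (num_labels, out_dim) label representations

# Illustrative usage: the mined label representations can act as classifier
# weights over a modality's features, guiding discriminative feature learning.
# img_feat: (B, out_dim) image features -> logits: (B, num_labels)
#   logits = img_feat @ label_gcn(adjacency).t()
```

For the third component, a cross-media contrastive term is commonly realized as a symmetric InfoNCE loss over paired image/text representations, and dual-way adversarial learning can be realized with a modality discriminator trained through a gradient-reversal layer. The sketch below assumes these standard formulations; the paper's exact losses and weighting may differ, and `disc` is a hypothetical discriminator (e.g., a small MLP with two output logits).

```python
def cross_media_contrastive_loss(img_h: torch.Tensor,
                                 txt_h: torch.Tensor,
                                 tau: float = 0.1) -> torch.Tensor:
    """Symmetric InfoNCE over paired image/text hash representations (B, K)."""
    img_h = F.normalize(img_h, dim=1)
    txt_h = F.normalize(txt_h, dim=1)
    logits = img_h @ txt_h.t() / tau                  # (B, B) similarities
    targets = torch.arange(img_h.size(0), device=img_h.device)
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass, negated gradient in the backward pass."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad):
        return -grad

def modality_adversarial_loss(img_h: torch.Tensor, txt_h: torch.Tensor,
                              disc: nn.Module) -> torch.Tensor:
    # The discriminator tries to tell image codes (label 0) from text codes
    # (label 1); gradient reversal makes the encoders learn codes it cannot
    # distinguish, pushing both modalities toward a shared space.
    codes = GradReverse.apply(torch.cat([img_h, txt_h], dim=0))
    labels = torch.cat([torch.zeros(img_h.size(0), dtype=torch.long),
                        torch.ones(txt_h.size(0), dtype=torch.long)])
    return F.cross_entropy(disc(codes), labels.to(img_h.device))
```

At retrieval time, hashing methods of this kind typically binarize the continuous representations, e.g. `b = torch.sign(img_h)`, so that search reduces to fast Hamming-distance comparison.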