Visual transformer-based image retrieval with multiple loss fusion

Huayong Liu, Cong Huang, Hanjun Jin, Xiaosi Fu, Pei Shi
{"title":"Visual transformer-based image retrieval with multiple loss fusion","authors":"Huayong Liu, Cong Huang, Hanjun Jin, Xiaosi Fu, Pei Shi","doi":"10.1117/12.2685738","DOIUrl":null,"url":null,"abstract":"Through hash learning, the image retrieval based on deep hash algorithm encodes the image into a fixed length hash code for fast retrieval and matching. However, previous deep hash retrieval models based on convolutional neural networks extract local information of the image using pooling and convolution technology, which requires deeper networks to obtain long distance dependency, leading to high complexity and computation. In this paper, we propose a visual Transformer model based on self-attention to learn long dependencies of images and enhance the extraction ability of image features. Furthermore, a loss function with multiple loss fusion is proposed, which combines hash contrastive loss, classification loss, and quantization loss, to fully utilize image label information to improve the quality of hash coding by learning more potential semantic information. Experimental results demonstrate the superior performance of the proposed method over multiple classical deep hash retrieval methods based on CNN and two transformer-based hash retrieval methods, on two different datasets and different lengths of hash code.","PeriodicalId":305812,"journal":{"name":"International Conference on Electronic Information Technology","volume":"98 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Electronic Information Technology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1117/12.2685738","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Through hash learning, the image retrieval based on deep hash algorithm encodes the image into a fixed length hash code for fast retrieval and matching. However, previous deep hash retrieval models based on convolutional neural networks extract local information of the image using pooling and convolution technology, which requires deeper networks to obtain long distance dependency, leading to high complexity and computation. In this paper, we propose a visual Transformer model based on self-attention to learn long dependencies of images and enhance the extraction ability of image features. Furthermore, a loss function with multiple loss fusion is proposed, which combines hash contrastive loss, classification loss, and quantization loss, to fully utilize image label information to improve the quality of hash coding by learning more potential semantic information. Experimental results demonstrate the superior performance of the proposed method over multiple classical deep hash retrieval methods based on CNN and two transformer-based hash retrieval methods, on two different datasets and different lengths of hash code.
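Only the abstract is available on this page, so the sketch below is an illustration of the kind of setup it describes rather than the paper's actual implementation: a visual Transformer backbone that produces relaxed hash codes, trained with a fused objective combining a pairwise hash contrastive loss, an auxiliary classification loss, and a quantization loss. The backbone choice (torchvision's vit_b_16), the tanh relaxation, the single-label cross-entropy head, the contrastive margin, and the loss weights are all assumptions introduced for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import vit_b_16


class ViTHashNet(nn.Module):
    """ViT backbone with a hash projection head and an auxiliary
    classification head (hypothetical layer sizes, not from the paper)."""

    def __init__(self, code_len=48, num_classes=10):
        super().__init__()
        self.backbone = vit_b_16(weights="IMAGENET1K_V1")
        self.backbone.heads = nn.Identity()        # expose the 768-d CLS feature
        self.hash_head = nn.Linear(768, code_len)
        self.cls_head = nn.Linear(code_len, num_classes)

    def forward(self, x):
        feat = self.backbone(x)                    # (B, 768) image feature
        codes = torch.tanh(self.hash_head(feat))   # relaxed codes in (-1, 1)
        logits = self.cls_head(codes)              # label prediction from codes
        return codes, logits


def fused_hash_loss(codes, labels, logits, margin=2.0, w_cls=1.0, w_quant=0.1):
    """Multi-loss fusion: contrastive + classification + quantization.
    Margin and weights are illustrative defaults, not the paper's values."""
    # Pairwise similarity from labels: 1 if two images share a class, else 0.
    sim = (labels.unsqueeze(0) == labels.unsqueeze(1)).float()
    # Pairwise Euclidean distances between relaxed hash codes.
    dist = torch.cdist(codes, codes, p=2)
    # Contrastive term: pull similar pairs together, push dissimilar
    # pairs apart until they exceed the margin.
    contrastive = (sim * dist.pow(2) + (1 - sim) * F.relu(margin - dist).pow(2)).mean()
    # Classification term: lets label semantics shape the codes.
    cls = F.cross_entropy(logits, labels)
    # Quantization term: push continuous codes toward binary {-1, +1}.
    quant = (codes.abs() - 1).pow(2).mean()
    return contrastive + w_cls * cls + w_quant * quant


if __name__ == "__main__":
    model = ViTHashNet(code_len=48, num_classes=10)
    images = torch.randn(8, 3, 224, 224)
    labels = torch.randint(0, 10, (8,))
    codes, logits = model(images)
    loss = fused_hash_loss(codes, labels, logits)
    print(loss.item())
```

At retrieval time the relaxed codes would be binarized (e.g. with a sign function) and compared by Hamming distance; the exact binarization and evaluation protocol used in the paper is not stated in the abstract.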