Large-scale Multi-label Image Retrieval Using Residual Network with Hash Layer

Baohua Qiang, Peiyao Wang, Shui-ping Guo, Zhi Xu, Wu Xie, Jinlong Chen, Xianjun Chen
Published in: 2019 Eleventh International Conference on Advanced Computational Intelligence (ICACI), 2019-06-01. DOI: 10.1109/ICACI.2019.8778549 (https://doi.org/10.1109/ICACI.2019.8778549)
Citations: 2

Abstract

In recent years, a growing number of deep hashing methods have been applied to large-scale multi-label image retrieval. However, in existing deep network models, the extracted low-level features cannot effectively integrate the multi-level semantic information and the pairwise similarity ranking information of multi-label images into a single hash-code learning scheme, so these models do not yield an efficient and accurate indexing method. Motivated by this, we propose an approach that uses the cosine distance between the semantic vectors of a pair of multi-label images to quantify the multi-level similarity between them. We use a residual network to learn the final feature representation of multi-label images, and we construct a deep hashing framework that extracts features and generates binary codes simultaneously. On the one hand, the improved model uses a deeper network and more complex network structures to strengthen low-level feature extraction. On the other hand, it is trained with a fine-tuning strategy, which accelerates convergence. Extensive experiments on two popular multi-label datasets show that the improved model is more accurate than the reference models. The mean average precision improves by factors of 1.0432 and 1.1114 on the two datasets, respectively.
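
The abstract does not give the exact similarity formula, loss, or hash-layer design, so the following is only a minimal sketch of the two ideas it names: a graded similarity between multi-label images computed from the cosine of their label vectors, and binary codes obtained by thresholding real-valued outputs. The label vectors, the zero threshold in `binarize`, and all function names are assumptions made for illustration, not the authors' implementation.

```python
import math


def label_cosine_similarity(a, b):
    """Cosine similarity between two label vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def binarize(activations):
    """Turn real-valued outputs into bits by thresholding at zero (assumed rule)."""
    return [1 if v >= 0 else 0 for v in activations]


def hamming_distance(code_a, code_b):
    """Number of bit positions in which two equal-length codes differ."""
    return sum(p != q for p, q in zip(code_a, code_b))


# Hypothetical binary label vectors over four tags: [person, dog, car, tree].
img_a = [1, 1, 0, 0]
img_b = [1, 1, 0, 0]  # same labels as img_a
img_c = [1, 0, 1, 0]  # shares one label with img_a
img_d = [0, 0, 1, 1]  # shares no label with img_a

for name, other in (("b", img_b), ("c", img_c), ("d", img_d)):
    sim = label_cosine_similarity(img_a, other)
    # Cosine distance is 1 - similarity; it grows as fewer labels are shared.
    print(f"a vs {name}: similarity={sim:.2f}, distance={1 - sim:.2f}")

# Made-up real-valued outputs standing in for a hash layer's activations.
code_1 = binarize([0.7, -0.2, 0.0, -1.3])
code_2 = binarize([0.4, 0.9, -0.5, -0.1])
print(code_1, code_2, hamming_distance(code_1, code_2))
```

Unlike a similar/dissimilar flag, the cosine value separates pairs that share all labels, some labels, or none, which is the sense of "multi-level" similarity used above. At retrieval time, binary codes are typically compared by Hamming distance, as in the last line.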