Cross-Scale Context Extracted Hashing for Fine-Grained Image Binary Encoding

Asian Conference on Machine Learning | Pub Date: 2022-10-14 | DOI: 10.48550/arXiv.2210.07572

Xuetong Xue, Jiaying Shi, Xinxue He, Sheng Xu, Zhaoming Pan


Citations: 1

Abstract

Deep hashing is widely used for large-scale image retrieval because encoding high-dimensional image data into binary codes makes computation efficient and storage cheap. Since binary codes carry less information than floating-point features, the essence of binary encoding is preserving the main context to guarantee retrieval quality. However, existing hashing methods are limited in two respects: they struggle to suppress redundant background information, and they rely on a simple sign function, which cannot accurately map features from Euclidean space to Hamming space. To address these problems, this paper proposes a Cross-Scale Context Extracted Hashing Network (CSCE-Net). First, we design a two-branch framework that captures fine-grained local information while maintaining high-level global semantic information. In addition, an Attention guided Information Extraction (AIE) module is introduced between the two branches; working together with global sliding windows, it suppresses areas of low context information. Unlike previous methods, CSCE-Net learns a content-related Dynamic Sign Function (DSF) to replace the original simple sign function. As a result, the proposed CSCE-Net is context-sensitive and performs well on accurate image binary encoding. We further demonstrate that CSCE-Net outperforms existing hashing methods, improving retrieval performance on standard benchmarks.
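For readers unfamiliar with the baseline the abstract argues against, the sketch below shows fixed sign-function hashing followed by Hamming-distance ranking. It is only an illustration of that conventional pipeline, not CSCE-Net or its Dynamic Sign Function; the function names (`sign_hash`, `hamming_distance`, `rank_by_hamming`) and the toy feature vectors are assumptions made up for this example.

```python
def sign_hash(features):
    """Binarize real-valued features with a fixed sign function (threshold at 0)."""
    return [1 if x >= 0 else 0 for x in features]


def hamming_distance(a, b):
    """Count the bit positions at which two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))


def rank_by_hamming(query_code, database_codes):
    """Return database indices ordered from nearest to farthest in Hamming space."""
    return sorted(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )


# Toy 4-dimensional features (hypothetical values, not taken from the paper).
query = [0.9, -0.2, 0.4, -0.7]
database = [
    [0.8, -0.1, 0.5, -0.9],   # same sign pattern as the query
    [-0.6, 0.3, -0.2, 0.4],   # every sign flipped
    [0.7, 0.2, 0.1, -0.3],    # one sign differs
]

query_code = sign_hash(query)
db_codes = [sign_hash(v) for v in database]
ranking = rank_by_hamming(query_code, db_codes)
print(query_code, ranking)
```

Because the threshold is fixed at zero for every image, a feature value of 0.01 and one of 0.9 map to the same bit. A learned, content-related replacement for this step is what the abstract says the DSF provides.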

