Revisiting Self-Similarity: Structural Embedding for Image Retrieval

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Pub Date: 2023-06-01. DOI: 10.1109/CVPR52729.2023.02242

Seongwon Lee, Suhyeon Lee, Hongje Seong, Euntai Kim



Abstract

Despite advances in global image representation, existing image retrieval approaches rarely consider geometric structure during the global retrieval stage. In this work, we revisit the conventional self-similarity descriptor from a convolutional perspective to encode both the visual and structural cues of an image into its global representation. Our proposed network, the Structural Embedding Network (SENet), captures the internal structure of images and gradually compresses it into dense self-similarity descriptors while learning diverse structures from various images. These self-similarity descriptors are fused with the original image features and then pooled into a global embedding, so that the embedding represents both geometric and visual cues of the image. With this novel structural embedding, our proposed network sets a new state of the art on several image retrieval benchmarks and demonstrates robustness to look-alike distractors. The code and models are available at https://github.com/sungonce/SENet.
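The pipeline described in the abstract (local self-similarity, fusion with the original features, pooling into a global embedding) can be illustrated with a minimal NumPy sketch. This is not the authors' SENet implementation, which is available in the linked repository. The function names, the window radius, the single linear projection `proj` standing in for the learned compression layers, the additive fusion, and the use of generalized-mean (GeM) pooling are all assumptions made for illustration.

```python
import numpy as np


def local_self_similarity(feat, radius=1):
    """Cosine similarity between each position and its (2r+1)^2 neighbours.

    feat: array of shape (C, H, W).
    Returns an array of shape (K, H, W) with K = (2 * radius + 1) ** 2.
    """
    _, h, w = feat.shape
    # L2-normalise channel vectors so dot products become cosine similarities.
    norm = np.linalg.norm(feat, axis=0, keepdims=True)
    unit = feat / np.maximum(norm, 1e-12)
    # Zero-pad spatially so border positions also have a full window.
    padded = np.pad(unit, ((0, 0), (radius, radius), (radius, radius)))
    sims = []
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            shifted = padded[:, dy:dy + h, dx:dx + w]
            sims.append((unit * shifted).sum(axis=0))
    return np.stack(sims, axis=0)


def gem_pool(feat, p=3.0, eps=1e-6):
    """Generalized-mean pooling over space: (C, H, W) -> (C,).

    Assumption: the abstract only says "pooled"; GeM is a common choice.
    """
    clipped = np.clip(feat, eps, None)
    return np.mean(clipped ** p, axis=(1, 2)) ** (1.0 / p)


def structural_embedding(feat, proj, radius=1):
    """Fuse self-similarity with visual features and pool to one vector.

    proj: hypothetical (C, K) matrix standing in for the learned layers
    that compress self-similarity into dense descriptors.
    """
    ss = local_self_similarity(feat, radius)        # (K, H, W)
    struct = np.einsum("ck,khw->chw", proj, ss)     # (C, H, W)
    fused = feat + struct                           # assumed additive fusion
    emb = gem_pool(fused)
    return emb / np.linalg.norm(emb)                # unit-length embedding
```

For a feature map of shape `(C, H, W)` and `radius=1`, `local_self_similarity` returns nine similarity maps, and `structural_embedding` returns a unit-length vector of length `C` that can be compared between images with a dot product.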
