SDE2D: Semantic-Guided Discriminability Enhancement Feature Detector and Descriptor

IF 8.4 1区计算机科学 Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

IEEE Transactions on Multimedia Pub Date : 2024-12-23 DOI:10.1109/TMM.2024.3521748

Jiapeng Li;Ruonan Zhang;Ge Li;Thomas H. Li

{"title":"SDE2D: Semantic-Guided Discriminability Enhancement Feature Detector and Descriptor","authors":"Jiapeng Li;Ruonan Zhang;Ge Li;Thomas H. Li","doi":"10.1109/TMM.2024.3521748","DOIUrl":null,"url":null,"abstract":"Local feature detectors and descriptors serve various computer vision tasks, such as image matching, visual localization, and 3D reconstruction. To address the extreme variations of rotation and light in the real world, most detectors and descriptors capture as much invariance as possible. However, these methods ignore feature discriminability and perform poorly in indoor scenes. Indoor scenes have too many weak-textured and even repeatedly textured regions, so it is necessary for the extracted features to possess sufficient discriminability. Therefore, we propose a semantic-guided method (called SDE2D) enhancing feature discriminability to improve the performance of descriptors for indoor scenes. We develop a kind of semantic-guided discriminability enhancement (SDE) loss function that uses semantic information from indoor scenes. To the best of our knowledge, this is the first deep research that applies semantic segmentation to enhance discriminability. In addition, we design a novel framework that allows semantic segmentation network to be well embedded as a module in the overall framework and provides guidance information for training. Besides, we explore the impact of different semantic segmentation models on our method. The experimental results on indoor scenes datasets demonstrate that the proposed SDE2D performs well compared with the state-of-the-art models.","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"27 ","pages":"275-286"},"PeriodicalIF":8.4000,"publicationDate":"2024-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Multimedia","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10812856/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}

引用次数: 0

Abstract

Local feature detectors and descriptors serve various computer vision tasks, such as image matching, visual localization, and 3D reconstruction. To address the extreme variations of rotation and light in the real world, most detectors and descriptors capture as much invariance as possible. However, these methods ignore feature discriminability and perform poorly in indoor scenes. Indoor scenes have too many weak-textured and even repeatedly textured regions, so it is necessary for the extracted features to possess sufficient discriminability. Therefore, we propose a semantic-guided method (called SDE2D) enhancing feature discriminability to improve the performance of descriptors for indoor scenes. We develop a kind of semantic-guided discriminability enhancement (SDE) loss function that uses semantic information from indoor scenes. To the best of our knowledge, this is the first deep research that applies semantic segmentation to enhance discriminability. In addition, we design a novel framework that allows semantic segmentation network to be well embedded as a module in the overall framework and provides guidance information for training. Besides, we explore the impact of different semantic segmentation models on our method. The experimental results on indoor scenes datasets demonstrate that the proposed SDE2D performs well compared with the state-of-the-art models.

查看原文本刊更多论文

语义引导的可区别性增强特征检测器和描述符

局部特征检测器和描述符服务于各种计算机视觉任务，如图像匹配、视觉定位和3D重建。为了解决现实世界中旋转和光的极端变化，大多数检测器和描述符捕获尽可能多的不变性。然而，这些方法忽略了特征可判别性，在室内场景中表现不佳。室内场景有太多弱纹理甚至重复纹理的区域，因此需要提取的特征具有足够的可分辨性。因此，我们提出了一种增强特征可分辨性的语义引导方法（SDE2D）来提高描述符在室内场景中的性能。本文提出了一种基于室内场景语义信息的语义引导可判别性增强（SDE）损失函数。据我们所知，这是第一次应用语义分割来增强可辨别性的深入研究。此外，我们设计了一个新的框架，使语义分割网络作为一个模块很好地嵌入到整个框架中，并为训练提供指导信息。此外，我们还探讨了不同的语义分割模型对我们方法的影响。室内场景数据集的实验结果表明，与现有模型相比，所提出的SDE2D模型具有良好的性能。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

IEEE Transactions on Multimedia 工程技术-电信学

CiteScore

11.70

自引率

11.00%

发文量

576

审稿时长

5.5 months

期刊介绍： The IEEE Transactions on Multimedia delves into diverse aspects of multimedia technology and applications, covering circuits, networking, signal processing, systems, software, and systems integration. The scope aligns with the Fields of Interest of the sponsors, ensuring a comprehensive exploration of research in multimedia.