Improving RGB-D salient object detection by addressing inconsistent saliency problems

IF 7.6 | Region 1: Computer Science | Q1: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Knowledge-Based Systems | Pub Date: 2024-05-28 | DOI: 10.1016/j.knosys.2024.111996

Kun Zuo, Hanguang Xiao, Hongmin Zhang, Diya Chen, Tianqi Liu, Yulin Li, Hao Wen

Knowledge-Based Systems, Volume 299, Article 111996. https://www.sciencedirect.com/science/article/pii/S0950705124006300

Citations: 0

Abstract

RGB-D salient object detection (SOD) models based on a two-stream structure perform well in single-object scenes. In multi-object scenes, however, the RGB and depth modalities can produce inconsistent saliency, which degrades the accuracy of the subsequent fusion. Inconsistent saliency arises from three issues. First, artifacts, missing depth values, and confusion in depth maps make the depth modality unreliable, so results rely more heavily on the RGB modality. Second, neither the RGB nor the depth modality receives guidance toward salient objects. Third, the modalities do not interact. To address these issues, we first propose a depth recovery (DR) block to mitigate the negative effects of both the original and estimated depth maps. Next, we design a saliency detection (SD) block that uses semantic information to guide each modality toward salient objects and combines multi-scale information to improve the detection of objects at different scales in each modality. Finally, a specific fusion block (SFB) fuses the salient object information obtained from the RGB and depth modalities. Quantitative and qualitative experiments demonstrate that our method achieves state-of-the-art (SOTA) performance among the 10 methods compared.
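
The notion of inconsistent saliency between the two streams can be made concrete with a small sketch. The code below is not the paper's DR, SD, or SFB block. It is one hypothetical way to measure agreement between a per-modality RGB saliency map and a depth saliency map, assuming both are given as 2D grids of values in [0, 1]. The function names, the 0.5 threshold, and the use of intersection-over-union as the agreement score are all assumptions made for illustration.

```python
from typing import List

# Assumption: a saliency map is a 2D grid of floats in [0, 1].
SaliencyMap = List[List[float]]


def binarize(saliency: SaliencyMap, threshold: float = 0.5) -> List[List[bool]]:
    """Mark pixels whose saliency meets the (hypothetical) threshold as salient."""
    return [[value >= threshold for value in row] for row in saliency]


def saliency_iou(rgb_map: SaliencyMap, depth_map: SaliencyMap,
                 threshold: float = 0.5) -> float:
    """Intersection-over-union of the two binarized saliency maps.

    1.0 means both streams mark exactly the same pixels as salient;
    0.0 means they highlight disjoint regions (fully inconsistent saliency).
    """
    rgb_mask = binarize(rgb_map, threshold)
    depth_mask = binarize(depth_map, threshold)
    intersection = 0
    union = 0
    for rgb_row, depth_row in zip(rgb_mask, depth_mask):
        for rgb_salient, depth_salient in zip(rgb_row, depth_row):
            intersection += int(rgb_salient and depth_salient)
            union += int(rgb_salient or depth_salient)
    # Two empty masks agree trivially.
    return intersection / union if union else 1.0


# Toy two-object scene: the RGB stream highlights the left object,
# while the depth stream highlights the right one.
rgb_scene = [[0.9, 0.8, 0.1, 0.0],
             [0.9, 0.7, 0.2, 0.1]]
depth_scene = [[0.1, 0.0, 0.9, 0.8],
               [0.2, 0.1, 0.8, 0.9]]
inconsistency_example = saliency_iou(rgb_scene, depth_scene)
```

In this toy multi-object scene the two streams select different objects, so the agreement score is zero. A fusion step that receives such maps has no shared salient region to reinforce, which is the situation the abstract describes as harmful to fusion accuracy.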

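Source journal

Knowledge-Based Systems (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore

14.80

Self-citation rate

12.50%

Articles published

1245

Review time

7.8 months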

Journal description: Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on knowledge-based and other artificial intelligence techniques-based systems. The journal aims to support human prediction and decision-making through data science and computation techniques, provide balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.