Dual attention guided multi-scale fusion network for RGB-D salient object detection

IF 3.4 | Zone 3 (Engineering & Technology) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC
Huan Gao, Jichang Guo, Yudong Wang, Jianan Dong
Journal: Signal Processing-Image Communication, volume 118, Article 117004
Publication date: 2023-10-01
Publication type: Journal Article
DOI: 10.1016/j.image.2023.117004
URL: https://www.sciencedirect.com/science/article/pii/S0923596523000863

Abstract

Although recent research on salient object detection (SOD) has made remarkable progress in leveraging both RGB and depth data, it remains worth exploring how to use the inherent relationship between the two modalities to extract and fuse features more effectively and thereby make more accurate predictions. In this paper, we combine the attention mechanism with the characteristics of SOD and propose the Dual Attention Guided Multi-scale Fusion Network. We design a multi-scale fusion block that combines multi-scale branches with channel attention to better fuse RGB and depth information. Exploiting the characteristics of SOD, we propose a dual attention module that makes the network pay more attention to the salient regions not yet predicted and to the erroneous parts of the regions already predicted. We perform an ablation study to verify the effectiveness of each component. Quantitative and qualitative experimental results demonstrate that our method achieves state-of-the-art (SOTA) performance.
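
The abstract names two mechanisms without giving their formulas: channel attention applied while fusing RGB and depth features, and a dual attention that separates regions already predicted as salient from those not yet predicted. The following is a minimal sketch of those two ideas only, not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the element-wise sum used for fusion, the parameter-free sigmoid gate standing in for a learned channel-attention layer, and the use of the map `p` and its complement `1 - p` as the two attention maps. The multi-scale branches are omitted.

```python
import math


def sigmoid(x):
    """Logistic function used as the channel gate."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention_fuse(rgb_feat, depth_feat):
    """Fuse two feature tensors shaped [C][H][W] (nested lists).

    Assumptions (not from the paper): fusion is an element-wise sum, and
    each channel is rescaled by the sigmoid of its global average instead
    of by a learned attention layer.
    """
    fused = []
    for rgb_ch, depth_ch in zip(rgb_feat, depth_feat):
        # Element-wise sum of the RGB and depth channel.
        summed = [[r + d for r, d in zip(rgb_row, depth_row)]
                  for rgb_row, depth_row in zip(rgb_ch, depth_ch)]
        # Global average pooling over the spatial dimensions.
        count = sum(len(row) for row in summed)
        mean = sum(sum(row) for row in summed) / count
        # One scalar gate per channel, applied to every pixel of it.
        gate = sigmoid(mean)
        fused.append([[gate * v for v in row] for row in summed])
    return fused


def dual_attention_maps(pred):
    """Split a coarse saliency map [H][W] with values in [0, 1] in two.

    Assumption: the first map (p) highlights regions already predicted as
    salient, where errors can be refined; the second map (1 - p)
    highlights regions not yet predicted as salient.
    """
    predicted = [[p for p in row] for row in pred]
    unpredicted = [[1.0 - p for p in row] for row in pred]
    return predicted, unpredicted
```

In a trained network the gate would come from learned layers and the two maps would reweight decoder features; here they are returned directly so the arithmetic is easy to inspect.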

Source journal
Signal Processing-Image Communication (Engineering & Technology: Engineering, Electrical & Electronic)
CiteScore: 8.40
Self-citation rate: 2.90%
Articles published: 138
Review time: 5.2 months
Journal description: Signal Processing: Image Communication is an international journal for the development of the theory and practice of image communication. Its primary objectives are the following: To present a forum for the advancement of theory and practice of image communication. To stimulate cross-fertilization between areas similar in nature which have traditionally been separated, for example, various aspects of visual communications and information systems. To contribute to a rapid information exchange between the industrial and academic environments.

The editorial policy and the technical content of the journal are the responsibility of the Editor-in-Chief, the Area Editors and the Advisory Editors. The Journal is self-supporting from subscription income and contains a minimum amount of advertisements. Advertisements are subject to the prior approval of the Editor-in-Chief. The journal welcomes contributions from every country in the world.

Signal Processing: Image Communication publishes articles relating to aspects of the design, implementation and use of image communication systems. The journal features original research work, tutorial and review articles, and accounts of practical developments. Subjects of interest include image/video coding, 3D video representations and compression, 3D graphics and animation compression, HDTV and 3DTV systems, video adaptation, video over IP, peer-to-peer video networking, interactive visual communication, multi-user video conferencing, wireless video broadcasting and communication, visual surveillance, 2D and 3D image/video quality measures, pre/post processing, video restoration and super-resolution, multi-camera video analysis, motion analysis, content-based image/video indexing and retrieval, face and gesture processing, video synthesis, 2D and 3D image/video acquisition and display technologies, architectures for image/video processing and communication.