Dual attention guided multi-scale fusion network for RGB-D salient object detection

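IF 3.4 · Zone 3 (Engineering & Technology) · JCR Q2 (ENGINEERING, ELECTRICAL & ELECTRONIC)

Signal Processing-Image Communication, vol. 118, Article 117004 · Pub Date: 2023-10-01 · DOI: 10.1016/j.image.2023.117004 · https://www.sciencedirect.com/science/article/pii/S0923596523000863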

Huan Gao, Jichang Guo, Yudong Wang, Jianan Dong


Citations: 0

Dual attention guided multi-scale fusion network for RGB-D salient object detection

While recent research on salient object detection (SOD) has shown remarkable progress in leveraging both RGB and depth data, it is still worth exploring how to use the inherent relationship between the two to extract and fuse features more effectively, and further make more accurate predictions. In this paper, we consider combining the attention mechanism with the characteristics of the SOD, proposing the Dual Attention Guided Multi-scale Fusion Network. We design the multi-scale fusion block by combining multi-scale branches with channel attention to achieve better fusion of RGB and depth information. Using the characteristic of the SOD, the dual attention module is proposed to make the network pay more attention to the currently unpredicted saliency regions and the wrong parts in the already predicted regions. We perform an ablation study to verify the effectiveness of each component. Quantitative and qualitative experimental results demonstrate that our method achieves state-of-the-art (SOTA) performance.
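The abstract names two mechanisms without giving layer-level detail: channel attention for fusing RGB and depth features, and a dual attention that emphasizes both the not-yet-predicted regions and the already predicted ones. The following is only a minimal sketch of those two ideas under stated assumptions, not the authors' implementation. The function names, the use of a sigmoid of the channel mean as the gate, and the "reverse attention" form `1 - sigmoid(logits)` are all illustrative choices made here; the real network uses learned layers that are not described in this text.

```python
import math


def sigmoid(x):
    """Logistic function, mapping a real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention_fuse(rgb_feat, depth_feat):
    """Concatenate RGB and depth channels, then gate each channel.

    Each feature stack is a list of channels; each channel is a list of rows.
    Assumption: the gate is sigmoid(global average of the channel). A trained
    model would normally learn this gate; here it has no parameters.
    """
    channels = rgb_feat + depth_feat  # concatenation along the channel axis
    fused = []
    for channel in channels:
        count = sum(len(row) for row in channel)
        mean = sum(sum(row) for row in channel) / count  # global average pool
        gate = sigmoid(mean)
        fused.append([[gate * value for value in row] for row in channel])
    return fused


def dual_attention(features, coarse_logits):
    """Weight features by a coarse saliency prediction in two ways.

    forward weights = sigmoid(logits): regions already predicted as salient.
    reverse weights = 1 - sigmoid(logits): regions not yet predicted as salient.
    Returns (forward_features, reverse_features).
    """
    forward_w = [[sigmoid(v) for v in row] for row in coarse_logits]
    reverse_w = [[1.0 - w for w in row] for row in forward_w]

    def apply(weights):
        return [
            [[w * v for w, v in zip(w_row, f_row)]
             for w_row, f_row in zip(weights, channel)]
            for channel in features
        ]

    return apply(forward_w), apply(reverse_w)


if __name__ == "__main__":
    rgb = [[[0.0, 0.0], [0.0, 0.0]]]      # one 2x2 RGB-branch channel
    depth = [[[2.0, 2.0], [2.0, 2.0]]]    # one 2x2 depth-branch channel
    fused = channel_attention_fuse(rgb, depth)
    logits = [[4.0, -4.0], [0.0, 0.0]]    # hypothetical coarse prediction
    forward, reverse = dual_attention(fused, logits)
    print(len(fused), forward[1][0], reverse[1][0])
```

In this sketch the forward and reverse weights sum to one at every pixel, so the two branches split the feature map between "already predicted" and "still unpredicted" regions; a later stage could process each branch separately before combining them.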


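Source journal

Signal Processing-Image Communication (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 8.40

Self-citation rate: 2.90%

Articles published: 138

Review time: 5.2 months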

About the journal: Signal Processing: Image Communication is an international journal for the development of the theory and practice of image communication. Its primary objectives are the following: to present a forum for the advancement of theory and practice of image communication; to stimulate cross-fertilization between areas similar in nature which have traditionally been separated, for example, various aspects of visual communications and information systems; and to contribute to a rapid information exchange between the industrial and academic environments.

The editorial policy and the technical content of the journal are the responsibility of the Editor-in-Chief, the Area Editors and the Advisory Editors. The Journal is self-supporting from subscription income and contains a minimum amount of advertisements. Advertisements are subject to the prior approval of the Editor-in-Chief. The journal welcomes contributions from every country in the world.

Signal Processing: Image Communication publishes articles relating to aspects of the design, implementation and use of image communication systems. The journal features original research work, tutorial and review articles, and accounts of practical developments.

Subjects of interest include image/video coding, 3D video representations and compression, 3D graphics and animation compression, HDTV and 3DTV systems, video adaptation, video over IP, peer-to-peer video networking, interactive visual communication, multi-user video conferencing, wireless video broadcasting and communication, visual surveillance, 2D and 3D image/video quality measures, pre/post processing, video restoration and super-resolution, multi-camera video analysis, motion analysis, content-based image/video indexing and retrieval, face and gesture processing, video synthesis, 2D and 3D image/video acquisition and display technologies, architectures for image/video processing and communication.