PMDFN3D: Pre-mid dual fusion network for 3D object detection

IF 2.9, Zone 3 (Engineering and Technology), JCR Q2 (ENGINEERING, ELECTRICAL & ELECTRONIC)
Haishun Du, Sen Wang, Wenzhe Zhang, Linbing Cao
Digital Signal Processing, Volume 166, Article 105399. Journal Article. Published 2025-06-10.
DOI: 10.1016/j.dsp.2025.105399
URL: https://www.sciencedirect.com/science/article/pii/S105120042500421X

Abstract

In recent years, multi-modality 3D object detection has been gradually becoming the mainstream approach to 3D object detection. Effectively fusing information from point cloud data and image data nevertheless remains a significant challenge. Existing multi-modality 3D object detection models mainly adopt one of the pre-, mid-, or post-fusion strategies to fuse image data and point cloud data, and each of these strategies has shortcomings. Integrating multiple fusion strategies into a single framework is still a research gap in the field. To fill this gap, we propose a pre-mid dual fusion network for 3D object detection (PMDFN3D), which integrates pre-fusion and mid-fusion into a unified framework. Specifically, we first design a depth-guided cross-modality feature fusion module that integrates image and point features without requiring complex feature alignment operations. Then, we design a neighboring feature interaction attention module to mitigate the loss of precision in point features caused by down-sampling operations in the point cloud backbone network. Finally, we design a simple object-level feature selector and an object-level feature-guided cross-modality feature fusion module, which adaptively integrate object-relevant image features with object-level point features. Experimental results on the SUN RGB-D dataset demonstrate that our network achieves state-of-the-art performance in 3D object detection.
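To make the two fusion stages named in the abstract more concrete, here is a minimal NumPy sketch of the general idea. It is not the authors' implementation: the abstract does not give the internals of the PMDFN3D modules, so everything below is an assumption for illustration. The function names (`project_points`, `pre_fuse`, `mid_fuse`), the pinhole intrinsics matrix `K`, the nearest-pixel sampling, and the projection-free attention are all hypothetical stand-ins. The pre-fusion part appends to each point the image feature found at its projected pixel; the mid-fusion part lets object-level features attend to image tokens.

```python
import numpy as np

def project_points(points_xyz, K):
    """Project camera-frame points (N, 3) to pixel coordinates (N, 2).

    K is an assumed 3x3 pinhole intrinsics matrix. Returns (uv, depth).
    """
    uvw = points_xyz @ K.T                      # rows are K @ p
    depth = uvw[:, 2]
    safe_depth = np.where(depth > 0, depth, 1.0)  # avoid dividing by <= 0
    uv = uvw[:, :2] / safe_depth[:, None]
    return uv, depth

def pre_fuse(points_xyz, point_feats, image_feats, K):
    """Pre-fusion sketch: concatenate each point feature with the image
    feature at the point's projected (nearest) pixel.

    image_feats has shape (H, W, C_img). Points behind the camera get zeros.
    """
    h, w, _ = image_feats.shape
    uv, depth = project_points(points_xyz, K)
    cols = np.clip(np.round(uv[:, 0]).astype(int), 0, w - 1)
    rows = np.clip(np.round(uv[:, 1]).astype(int), 0, h - 1)
    sampled = image_feats[rows, cols]           # (N, C_img)
    sampled = np.where((depth > 0)[:, None], sampled, 0.0)
    return np.concatenate([point_feats, sampled], axis=1)

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def mid_fuse(object_feats, image_tokens):
    """Mid-fusion sketch: object-level features (M, D) attend to image
    tokens (T, D) with scaled dot-product attention and a residual add.

    Learned query/key/value projections are omitted for brevity.
    """
    d = object_feats.shape[1]
    attn = softmax(object_feats @ image_tokens.T / np.sqrt(d))  # (M, T)
    return object_feats + attn @ image_tokens
```

In this sketch, `pre_fuse` would run before the point cloud backbone and `mid_fuse` after object-level features have been extracted, which mirrors the pre-mid ordering the abstract describes only at a high level.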
Source journal
Digital Signal Processing (Engineering and Technology: Engineering, Electrical & Electronic)
CiteScore: 5.30
Self-citation rate: 17.20%
Articles published: 435
Review time: 66 days
Journal description: Digital Signal Processing: A Review Journal is one of the oldest and most established journals in the field of signal processing, yet it aims to be the most innovative. The Journal invites top-quality research articles at the frontiers of research in all aspects of signal processing. Our objective is to provide a platform for the publication of ground-breaking research in signal processing with both academic and industrial appeal. The journal has a special emphasis on statistical signal processing methodology such as Bayesian signal processing, and encourages articles on emerging applications of signal processing such as:
• big data
• machine learning
• internet of things
• information security
• systems biology and computational biology
• financial time series analysis
• autonomous vehicles
• quantum computing
• neuromorphic engineering
• human-computer interaction and intelligent user interfaces
• environmental signal processing
• geophysical signal processing including seismic signal processing
• chemioinformatics and bioinformatics
• audio, visual and performance arts
• disaster management and prevention
• renewable energy