Wenxuan Li, Jian Zhou, Chi Chen, Hongkai Yu, Bo Du, Qin Zou
Title: BEVFix: Deep feature enhancement for robust 3D object detection
Journal: Neural Networks, volume 190, Article 107675
Publication date: 2025-06-06
DOI: 10.1016/j.neunet.2025.107675
URL: https://www.sciencedirect.com/science/article/pii/S0893608025005556
Impact factor: 6.3 (JCR Q1, Computer Science, Artificial Intelligence)
Citations: 0
Abstract
Recent advances in Bird’s Eye View (BEV)-based 3D object detection have highlighted its potential to improve scene understanding in autonomous driving. However, existing BEV-based methods that rely on point clouds face significant challenges from the inherent sparsity and noise of those point clouds, which often compromise the accuracy of BEV representations. Furthermore, in multimodal 3D object detection, the lack of depth information in images can distort the image BEV features generated through view transformation, which in turn leads to inaccuracies in the fused BEV representation. To overcome these limitations, we introduce BEVFix, an end-to-end 3D object detection method designed to refine BEV representations. BEVFix first generates a mask from the point cloud distribution to identify the regions that require repair. Our WaveRefiner then applies the Discrete Wavelet Transform (DWT) for multi-frequency decomposition and uses a Feed-Forward Network (FFN) to isolate noise while selectively retaining critical features. Together, these components reduce noise and enhance BEV representations. Experiments on the nuScenes and Waymo datasets demonstrate that BEVFix significantly improves performance, achieving state-of-the-art results. The source code will be publicly available at https://github.com/WenxuanLi-whu/Co-Fix3d.
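
The abstract names two building blocks, a mask derived from the point cloud distribution and a DWT-based multi-frequency decomposition. The sketch below illustrates them generically in NumPy. It is not the authors' implementation. The helper names (`sparse_region_mask`, `haar_dwt2`, `haar_idwt2`, `attenuate_in_mask`), the point-count threshold, the choice of the Haar wavelet, and the fixed `gain` that stands in for the learned FFN are all assumptions made for illustration.

```python
import numpy as np


def sparse_region_mask(points_xy, x_range, y_range, cell, min_points=1):
    """Mark BEV cells that hold fewer than `min_points` points.

    Hypothetical helper: the count threshold is an assumed criterion,
    not the rule used by BEVFix.
    """
    nx = int(round((x_range[1] - x_range[0]) / cell))
    ny = int(round((y_range[1] - y_range[0]) / cell))
    counts, _, _ = np.histogram2d(
        points_xy[:, 0], points_xy[:, 1],
        bins=(nx, ny), range=(x_range, y_range),
    )
    return counts < min_points  # True where a cell may need repair


def haar_dwt2(x):
    """Single-level 2D Haar DWT of an (H, W) array with even H and W."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    ll = (a + b + c + d) / 2.0  # low-frequency approximation
    lh = (a - b + c - d) / 2.0  # detail: differences along the width axis
    hl = (a + b - c - d) / 2.0  # detail: differences along the height axis
    hh = (a - b - c + d) / 2.0  # detail: diagonal differences
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Exact inverse of `haar_dwt2`."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=ll.dtype)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x


def attenuate_in_mask(feat, mask, gain=0.5):
    """Scale the detail bands by `gain` inside masked regions.

    A fixed gain is a stand-in (assumption) for the learned FFN that the
    abstract describes; it only shows where such a filter would act.
    """
    ll, lh, hl, hh = haar_dwt2(feat)
    # A subband cell counts as masked if any of its 2x2 source cells is.
    m = mask[0::2, 0::2] | mask[0::2, 1::2] | mask[1::2, 0::2] | mask[1::2, 1::2]
    scale = np.where(m, gain, 1.0)
    return haar_idwt2(ll, lh * scale, hl * scale, hh * scale)
```

With this normalization the Haar transform is orthonormal, so `haar_idwt2(*haar_dwt2(x))` reproduces `x` and any change in the output comes only from the scaling applied to the detail bands. In a real detector the feature map would carry a channel dimension and the per-band filtering would be learned rather than fixed.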
About the journal:
Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.