Contrastive Late Fusion for 3D Object Detection
Tingyu Zhang; Zhigang Liang; Yanzhao Yang; Xinyu Yang; Yu Zhu; Jian Wang
IEEE Transactions on Intelligent Vehicles, vol. 10, no. 5, pp. 3442-3457. Published 2024-09-03. Journal Article.
DOI: 10.1109/TIV.2024.3454085
https://ieeexplore.ieee.org/document/10663866/
Journal impact factor: 14.3 (JCR Q1, Computer Science, Artificial Intelligence)
Citations: 0
Abstract
In autonomous driving, accurate and efficient 3D object detection is crucial for safe and reliable operation. This paper focuses on late fusion of camera and LiDAR data for 3D object detection. The proposed approach, the Contrastive Camera-LiDAR Object Candidates (C-CLOCs) fusion network, uses contrastive learning to enhance feature consistency between camera and LiDAR candidates, which facilitates better fusion results. We also examine label assignment in late-fusion methods and introduce a novel label assignment strategy that filters out irrelevant information. Additionally, we introduce a Multi-modality Ground-truth Sampling (MGS) method that includes both LiDAR point cloud information and the corresponding images in training samples, resulting in improved performance. Experimental results demonstrate the effectiveness of the proposed method in achieving accurate 3D object detection in autonomous driving scenarios.
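The abstract does not specify the contrastive objective, so the following is only an illustrative sketch of one common choice, a symmetric InfoNCE-style loss, and not the authors' formulation. It assumes that row i of the camera features and row i of the LiDAR features describe the same object candidate (a positive pair) and that all other rows serve as negatives. The function name `contrastive_loss`, the `temperature` value, and the toy feature vectors are all hypothetical.

```python
import math

def _normalize(v):
    # Scale a feature vector to unit length so dot products are cosine similarities.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def _row_loss(logits):
    # Mean over rows of -log softmax(row)[i], i.e. cross-entropy with the
    # diagonal entry treated as the correct (positive) match.
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum - row[i]
    return total / len(logits)

def contrastive_loss(cam_feats, lidar_feats, temperature=0.1):
    # Hypothetical symmetric InfoNCE loss between paired candidate features.
    cam = [_normalize(v) for v in cam_feats]
    lid = [_normalize(v) for v in lidar_feats]
    logits = [[sum(a * b for a, b in zip(c, l)) / temperature for l in lid]
              for c in cam]
    transposed = [list(col) for col in zip(*logits)]
    # Average the camera-to-LiDAR and LiDAR-to-camera directions.
    return 0.5 * (_row_loss(logits) + _row_loss(transposed))

# Toy data (assumed): three candidates with 3-dimensional features.
lidar = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
cam_aligned = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.0, 0.1, 0.9]]
cam_mismatched = [cam_aligned[1], cam_aligned[2], cam_aligned[0]]

loss_aligned = contrastive_loss(cam_aligned, lidar)
loss_mismatched = contrastive_loss(cam_mismatched, lidar)
print(loss_aligned < loss_mismatched)
```

In this sketch the loss is low when each camera candidate is most similar to its own LiDAR counterpart and high when the pairing is scrambled, which is the sense in which a contrastive term can encourage cross-modal feature consistency.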
About the journal:
The IEEE Transactions on Intelligent Vehicles (T-IV) is a premier platform for publishing peer-reviewed articles that present innovative research concepts, application results, significant theoretical findings, and application case studies in the field of intelligent vehicles. With a particular emphasis on automated vehicles within roadway environments, T-IV aims to raise awareness of pressing research and application challenges.
Our focus is on providing critical information to the intelligent vehicle community, serving as a dissemination vehicle for IEEE ITS Society members and others interested in learning about the state-of-the-art developments and progress in research and applications related to intelligent vehicles. Join us in advancing knowledge and innovation in this dynamic field.