Unsupervised 3D Object Detection by Commonsense Clue

Impact Factor: 18.6

IEEE Transactions on Pattern Analysis and Machine Intelligence | Pub Date: 2025-08-13 | DOI: 10.1109/TPAMI.2025.3598341

Hai Wu, Shijia Zhao, Xun Huang, Qiming Xia, Chenglu Wen, Li Jiang, Xin Li, Cheng Wang


Citations: 0

Abstract

Traditional 3D object detectors, whether fully-, semi-, or weakly-supervised, rely heavily on extensive human annotations. In contrast, this paper introduces an unsupervised 3D object detector that automatically discerns object patterns without such annotations. To achieve this, we propose a Commonsense Prototype-based Detector (CPD) for unsupervised 3D object detection. CPD first constructs Commonsense Prototypes (CProto) to represent the geometric center and size of objects. It then generates high-quality pseudo-labels and guides detector convergence using size and geometry priors from CProto. Building on CPD, we further introduce CPD++, an enhanced version that improves performance by leveraging motion cues. CPD++ learns localization from stationary objects and recognition from moving objects, facilitating the mutual transfer of localization and recognition knowledge between these two object types. Both CPD and CPD++ outperform existing state-of-the-art unsupervised 3D detectors. Furthermore, when trained on the Waymo Open Dataset (WOD) and tested on KITTI, CPD++ achieves 89.25% 3D Average Precision (AP) on the moderate car class at a 0.5 IoU threshold, reaching 95.3% of the performance attained by fully supervised counterparts. These results underscore the significant advancements brought by our method.
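To make the "0.5 IoU threshold" in the abstract concrete, the following is a minimal sketch of how a predicted 3D box can be matched against a reference box by intersection over union. It is not the paper's evaluation code: the `Box3D` type and the `iou_3d` and `is_match` helpers are hypothetical names introduced here, and the boxes are assumed to be axis-aligned for simplicity, whereas benchmark evaluations generally use boxes rotated about the vertical axis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box3D:
    """Axis-aligned 3D box (illustrative assumption): center and full extents."""
    cx: float
    cy: float
    cz: float
    dx: float
    dy: float
    dz: float

    def volume(self) -> float:
        return self.dx * self.dy * self.dz


def _overlap_1d(c1: float, d1: float, c2: float, d2: float) -> float:
    """Length of the overlap between two intervals given by center and extent."""
    lo = max(c1 - d1 / 2, c2 - d2 / 2)
    hi = min(c1 + d1 / 2, c2 + d2 / 2)
    return max(0.0, hi - lo)


def iou_3d(a: Box3D, b: Box3D) -> float:
    """Intersection over union of two axis-aligned 3D boxes."""
    inter = (
        _overlap_1d(a.cx, a.dx, b.cx, b.dx)
        * _overlap_1d(a.cy, a.dy, b.cy, b.dy)
        * _overlap_1d(a.cz, a.dz, b.cz, b.dz)
    )
    union = a.volume() + b.volume() - inter
    return inter / union if union > 0 else 0.0


def is_match(pred: Box3D, ref: Box3D, threshold: float = 0.5) -> bool:
    """A prediction counts as a match when its IoU reaches the threshold."""
    return iou_3d(pred, ref) >= threshold
```

For example, two boxes of size 4 × 2 × 2 whose centers are offset by 1 along x overlap in a 3 × 2 × 2 region, giving an IoU of 12 / 20 = 0.6, which passes a 0.5 threshold.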


Source journal

IEEE Transactions on Pattern Analysis and Machine Intelligence

Self-citation rate

0.00%

Number of articles published