A Component for Query-based Object Detection in Crowded Scenes
Shuo Mao
Proceedings of the 2023 2nd Asia Conference on Algorithms, Computing and Machine Learning, published 2023-03-17
DOI: 10.1145/3590003.3590039 (https://doi.org/10.1145/3590003.3590039)
Abstract
Query-based object detectors such as DETR and Sparse R-CNN have gained considerable attention in recent years. However, in dense scenes, these end-to-end detectors are prone to false positives. To address this issue, we propose a graph convolution-based post-processing component that refines the outputs of Sparse R-CNN. Specifically, we first select high-scoring queries to generate true positive predictions. The query updater then refines noisy query features using a graph convolutional network (GCN). Finally, the label assignment rule matches accepted predictions to ground truth objects, eliminates the matched targets, and associates noisy predictions with the remaining ground truth objects. Our method significantly enhances performance in crowded scenes, achieving 92.3% AP and 41.6% on the CrowdHuman dataset, a challenging object detection benchmark.
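The two-stage label assignment described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the matching criterion or any thresholds, so the greedy IoU matching, the `score_thresh` and `iou_thresh` values, and all function names below are assumptions made for illustration only; the GCN-based query updater is not modeled here.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def split_by_score(scores: List[float], score_thresh: float) -> Tuple[List[int], List[int]]:
    """Split prediction indices into accepted (high-scoring) and noisy ones."""
    accepted = [i for i, s in enumerate(scores) if s >= score_thresh]
    noisy = [i for i, s in enumerate(scores) if s < score_thresh]
    return accepted, noisy


def greedy_match(
    pred_ids: List[int],
    boxes: List[Box],
    gt_ids: List[int],
    gt_boxes: List[Box],
    iou_thresh: float,
) -> Dict[int, int]:
    """Pair predictions with ground truth greedily, highest IoU first.

    Assumption: the source does not state the matching rule; greedy IoU
    matching is used here as a simple stand-in.
    """
    pairs = sorted(
        ((iou(boxes[p], gt_boxes[g]), p, g) for p in pred_ids for g in gt_ids),
        reverse=True,
    )
    matches: Dict[int, int] = {}
    used_gt = set()
    for overlap, p, g in pairs:
        if overlap < iou_thresh:
            break  # all remaining pairs overlap even less
        if p in matches or g in used_gt:
            continue
        matches[p] = g
        used_gt.add(g)
    return matches


def two_stage_assign(
    boxes: List[Box],
    scores: List[float],
    gt_boxes: List[Box],
    score_thresh: float = 0.7,  # hypothetical value
    iou_thresh: float = 0.5,    # hypothetical value
) -> Tuple[Dict[int, int], Dict[int, int]]:
    """Return (accepted matches, noisy matches) as prediction -> GT index maps."""
    accepted, noisy = split_by_score(scores, score_thresh)
    all_gt = list(range(len(gt_boxes)))

    # Stage 1: accepted predictions are matched against every ground truth box.
    stage1 = greedy_match(accepted, boxes, all_gt, gt_boxes, iou_thresh)

    # Stage 2: matched targets are removed, so noisy predictions can only be
    # associated with ground truth objects that are still unclaimed.
    claimed = set(stage1.values())
    remaining = [g for g in all_gt if g not in claimed]
    stage2 = greedy_match(noisy, boxes, remaining, gt_boxes, iou_thresh)
    return stage1, stage2
```

For example, with predictions `[(0, 0, 10, 10), (1, 0, 11, 10), (20, 0, 30, 10)]`, scores `[0.9, 0.3, 0.4]`, and ground truth `[(0, 0, 10, 10), (20, 0, 30, 10)]`, the first prediction claims the first object in stage 1. The low-scoring near-duplicate of that object then has no target left in stage 2 and stays unmatched, while the third prediction is associated with the second object. Under these assumptions, this is how removing matched targets keeps duplicates in crowded scenes from being trained as positives.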