Ranking-based adaptive query generation for DETRs in crowded pedestrian detection
Authors: Feng Gao, Jiaxu Leng, Ji Gan, Xinbo Gao
Journal: Neurocomputing (Impact Factor 5.5; JCR Q1, Computer Science, Artificial Intelligence)
Publication type: Journal Article
Published: 2024-10-05
DOI: 10.1016/j.neucom.2024.128710
URL: https://www.sciencedirect.com/science/article/pii/S0925231224014814
Citation count: 0
Abstract
Variants of the DEtection TRansformer (DETRs) have shown promising performance in crowded pedestrian detection. However, we observe that DETRs are sensitive to one hyper-parameter, the number of queries, and that tuning it is crucial for achieving competitive performance across different crowded pedestrian datasets. Existing query generation methods can only generate a fixed number of queries set by this hyper-parameter, which often leads to missed and incorrect detections because the number and density of pedestrians vary widely across crowded scenes. To address this challenge, we propose an adaptive query generation method called Ranking-based Adaptive Query Generation (RAQG). RAQG comprises three components: a ranking prediction head, a query supplementer, and the Soft Gradient L1 Loss (SGL1). Specifically, we use the ranking of the positive training sample with the lowest confidence score to generate queries adaptively. The ranking prediction head predicts this ranking, which guides query generation. To refine the process, a query supplementer adjusts the number of queries based on the predicted ranking. Furthermore, we introduce SGL1, a novel loss function for training the ranking prediction head over a wide regression range. Our method is designed to be lightweight and universal, suitable for integration into any DETR framework for crowded pedestrian detection. Experimental results on the CrowdHuman and CityPersons datasets demonstrate that RAQG generates queries adaptively and achieves competitive results. Notably, our approach achieves a state-of-the-art 39.4% MR on CrowdHuman.
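The core idea of ranking-guided query selection can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `lowest_positive_rank` and `select_queries`, the `margin` parameter (a crude stand-in for the query supplementer), and the example scores are all hypothetical, and the ranking prediction head and SGL1 loss are not modeled because the abstract does not specify them. The sketch only shows how the rank of the lowest-scoring positive sample could serve as a training target, and how a predicted rank could set the number of queries kept.

```python
from typing import List, Sequence


def lowest_positive_rank(scores: Sequence[float], is_positive: Sequence[bool]) -> int:
    """1-based rank (by descending score) of the lowest-scoring positive sample.

    Returns 0 when no sample is positive. Hypothetical helper: this is the
    quantity a ranking prediction head would be trained to regress.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    rank = 0
    for position, idx in enumerate(order, start=1):
        if is_positive[idx]:
            rank = position  # keeps updating, so it ends at the last positive
    return rank


def select_queries(
    scores: Sequence[float],
    predicted_rank: float,
    margin: int = 0,
    min_queries: int = 1,
) -> List[int]:
    """Indices of the top-k candidates, where k follows the predicted rank.

    `margin` is an assumed stand-in for a query supplementer that adds a few
    extra queries; the paper's actual mechanism may differ.
    """
    k = max(min_queries, int(round(predicted_rank)) + margin)
    k = min(k, len(scores))  # cannot keep more candidates than exist
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Illustrative values only (not taken from the paper).
scores = [0.9, 0.2, 0.75, 0.6, 0.1]
positives = [True, False, False, True, False]

target_rank = lowest_positive_rank(scores, positives)
kept = select_queries(scores, target_rank)
```

With a fixed query budget, a sparse scene wastes queries and a dense scene runs short; tying `k` to a per-image rank lets the budget follow the scene.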
About the journal:
Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.