Few-shot Object Counting and Detection with Query-Guided Attention
Author: Yuhao Lin
Published in: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, 2023-05-26
DOI: 10.1145/3603781.3603865 (https://doi.org/10.1145/3603781.3603865)
Abstract
This paper addresses Few-Shot Counting and Detection (FSCD), the task of counting and localizing target objects given a few exemplar bounding boxes. We tackle two major challenges in developing an FSCD model: the high cost of bounding box labeling and the large variation in object appearance. To reduce the labeling cost, we propose a neighbor distance-aware mechanism for generating pseudo bounding boxes. This mechanism uses neighboring objects as context to estimate the location and size of each target object and requires no training. To handle appearance variation, we introduce a query-guided attention module that enhances the visual features of the search image through multi-head cross attention with query features. The module directs the model to focus on regions of the search image that resemble the target objects. We integrate the query-guided attention module into the Faster-RCNN object detection model, yielding a new few-shot object detector named Counting-RCNN. The proposed approach outperforms the state-of-the-art method on the large-scale FSCD147 dataset, with improvements of 0.60 in MAE, 5.36 in RMSE, and 13.01% in AP50.
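The query-guided attention idea can be illustrated with a minimal NumPy sketch of multi-head cross attention. This is not the paper's implementation. It assumes that search-image features act as attention queries, that exemplar ("query") features supply keys and values, and that the attended output is added back to the search features as a residual. The function name `query_guided_attention`, the projection matrices `w_q`, `w_k`, `w_v`, `w_o`, and all shapes are hypothetical choices made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def query_guided_attention(search, exemplars, w_q, w_k, w_v, w_o, num_heads):
    """Enhance search-image features with exemplar features (illustrative only).

    search:    (N, d) flattened search-image feature vectors
    exemplars: (M, d) feature vectors pooled from the exemplar boxes
    w_*:       (d, d) projection matrices (assumed, randomly initialised below)
    """
    n, d = search.shape
    m = exemplars.shape[0]
    head_dim = d // num_heads

    # Project, then split channels into heads: (heads, tokens, head_dim).
    q = (search @ w_q).reshape(n, num_heads, head_dim).transpose(1, 0, 2)
    k = (exemplars @ w_k).reshape(m, num_heads, head_dim).transpose(1, 0, 2)
    v = (exemplars @ w_v).reshape(m, num_heads, head_dim).transpose(1, 0, 2)

    # Scaled dot-product similarity between each search location and each exemplar.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)  # (heads, N, M)
    weights = softmax(scores, axis=-1)

    # Weighted sum of exemplar values, then merge heads back to (N, d).
    attended = (weights @ v).transpose(1, 0, 2).reshape(n, d)

    # Residual connection (an assumption): keep the original search features
    # and add the exemplar-conditioned signal on top.
    enhanced = search + attended @ w_o
    return enhanced, weights


rng = np.random.default_rng(0)
d, heads = 8, 2
search = rng.normal(size=(16, d))     # e.g. a 4x4 feature map, flattened
exemplars = rng.normal(size=(3, d))   # three exemplar boxes
w_q, w_k, w_v, w_o = (rng.normal(scale=0.1, size=(d, d)) for _ in range(4))

enhanced, weights = query_guided_attention(
    search, exemplars, w_q, w_k, w_v, w_o, num_heads=heads
)
print(enhanced.shape, weights.shape)  # (16, 8) (2, 16, 3)
```

In this sketch each row of `weights` is a distribution over the exemplars for one search location, so locations that resemble an exemplar receive a stronger exemplar-conditioned update. In a detector such as Faster-RCNN, the enhanced features would then presumably be passed on to the region proposal and detection heads.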