Authors: Lei Shi, Charles Livermore, I. Kakadiaris
Published in: 2020 IEEE International Joint Conference on Biometrics (IJCB), 2020-09-28
DOI: https://doi.org/10.1109/IJCB48548.2020.9304883
DVRNet: Decoupled Visible Region Network for Pedestrian Detection
Pedestrian detection remains challenging because occlusion patterns vary widely. Visible-body bounding boxes are typically used as an extra supervision signal to improve full-body prediction. However, visible-body assisted approaches produce a large number of false positives, which result from a lack of adequate and discriminative full-body contextual information. In this paper, we propose a new network, dubbed DVRNet, built on Bi-box, a representative visible-body assisted pedestrian detector. Specifically, we extend Bi-box with three modules: the attention-based feature interleaver module (AFIM), the binary mask learning module (BMLM), and the head-aware feature enhancement module (HFEM). These modules use features learned from the visible-body and head supervision signals to enrich the highly discriminative contextual information of the full body and to strengthen the feature representation. Experimental results indicate that DVRNet achieves promising results on the CityPersons and CrowdHuman datasets.