Qing He, Qianqian Yang, Yinfeng Xia, Sifan Peng, B. Yin
International Conference on Digital Image Processing, published 2022-10-12. DOI: 10.1117/12.2643005 (https://doi.org/10.1117/12.2643005)
Attention-guided feature fusion network for crowd counting
Scale variation and background interference remain open problems for crowd counting algorithms in practical applications. To tackle both, we propose the Attention-guided Feature Fusion Network (AFFNet), which learns the mapping from a crowd image to its density map. Within the network, the Channel-attentive Receptive Field Block (CRFB) uses parallel convolutional layers with different dilation rates to extract multi-scale features. The Feature Fusion Module (FFM) uses attention masks generated from high-level features to adjust low-level features, alleviating background interference at the feature level. In addition, the Double Branch Module (DBM) generates the estimated density map and further suppresses background interference at the density level. Extensive experiments on several challenging benchmark datasets, including ShanghaiTech, UCF-QNRF and JHU-CROWD++, show that the proposed method outperforms state-of-the-art approaches.
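The two mechanisms named in the abstract, parallel convolutions with different dilation rates and attention masks derived from high-level features, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract gives no layer details, so the function names (`dilated_conv2d`, `multi_scale_features`, `attention_guided_fusion`), the 3x3 kernels, the dilation rates (1, 2, 3), the sigmoid channel weighting, and the nearest-neighbour upsampling by a factor of 2 are all assumptions chosen only to make the idea concrete on single-channel maps.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def dilated_conv2d(x, kernel, dilation):
    """Zero-padded, same-size 2D cross-correlation with a dilated 3x3 kernel.

    Hypothetical helper: a larger dilation widens the receptive field
    without adding kernel weights.
    """
    k = kernel.shape[0]
    pad = dilation * (k // 2)
    xp = np.pad(x, pad)  # constant (zero) padding on every side
    h, w = x.shape
    out = np.zeros((h, w), dtype=float)
    for i in range(k):
        for j in range(k):
            r, c = i * dilation, j * dilation
            out += kernel[i, j] * xp[r:r + h, c:c + w]
    return out


def multi_scale_features(x, kernels, dilations=(1, 2, 3)):
    """Run parallel dilated branches, then reweight each branch (channel).

    Assumed channel attention: sigmoid of each branch's global average.
    """
    branches = np.stack(
        [dilated_conv2d(x, k, d) for k, d in zip(kernels, dilations)]
    )
    weights = sigmoid(branches.mean(axis=(1, 2)))  # one weight per branch
    return branches * weights[:, None, None]


def attention_guided_fusion(low, high):
    """Gate a low-level map with a mask computed from a high-level map.

    Assumes `high` has half the spatial resolution of `low`.
    """
    up = np.repeat(np.repeat(high, 2, axis=0), 2, axis=1)  # nearest upsampling
    mask = sigmoid(up)  # values in (0, 1); near 0 suppresses a location
    return low * mask


# Small demonstration on random data (shapes only, no trained weights).
rng = np.random.default_rng(0)
image = rng.random((8, 8))
kernels = [rng.standard_normal((3, 3)) for _ in range(3)]
features = multi_scale_features(image, kernels)          # shape (3, 8, 8)
fused = attention_guided_fusion(features[0], rng.standard_normal((4, 4)))
```

In this sketch the mask multiplies the low-level map location by location, so regions where the high-level map is strongly negative are driven toward zero. That is one simple way to read "adjusting low-level features" to reduce background response; the actual FFM may combine the features differently.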