Title: Improve the Scale Invariance of the Convolutional Network for Crowd Counting
Authors: Ryan Jin
Published in: 2021 IEEE International Conference on Consumer Electronics and Computer Engineering (ICCECE)
Publication date: 2021-01-15
DOI: 10.1109/ICCECE51280.2021.9342331 (https://doi.org/10.1109/ICCECE51280.2021.9342331)
Citation count: 0
Abstract
The main challenge in crowd counting is the considerable variation found in complex scenes and backgrounds. This paper first shows that Convolutional Neural Networks (CNNs) are incapable of addressing these variations. To solve this problem, we propose a novel attention mechanism that improves the scale invariance of convolutional networks. Our method not only automatically exploits spatial awareness to optimize the convolutional features but also imitates the human attention mechanism to remove background noise. Notably, it can be plugged into a vanilla convolution/pooling layer with relatively little computational cost. We have integrated our method into several state-of-the-art methods. Extensive experiments on five popular benchmarks demonstrate that our approach significantly outperforms other state-of-the-art methods and beats the vanilla convolution/pooling layer in all cases.
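The abstract does not spell out how the attention mechanism is built, so the following is only a generic illustration of spatial attention applied to convolutional features, not the paper's method. It assumes a feature map stored as nested lists of shape channels x height x width, and it uses a fixed, hypothetical scoring rule (a sigmoid of the channel mean) where a real module would learn its weights. The names `spatial_attention` and `sigmoid` are illustrative.

```python
import math


def sigmoid(x):
    """Squash a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def spatial_attention(features):
    """Gate a feature map with a per-location attention mask.

    features: list of C channels, each an H x W nested list of floats.
    Returns (gated, mask). The mask is H x W and is shared by all channels.

    Assumption: the mask is a sigmoid of the channel-wise mean. This is a
    stand-in for a learned scoring layer, chosen to keep the sketch small.
    """
    channels = len(features)
    height = len(features[0])
    width = len(features[0][0])

    # One attention weight per spatial location, computed across channels.
    mask = [
        [
            sigmoid(sum(features[c][i][j] for c in range(channels)) / channels)
            for j in range(width)
        ]
        for i in range(height)
    ]

    # Scale every channel by the same spatial mask, so locations with weak
    # responses (for example, background) are suppressed.
    gated = [
        [[features[c][i][j] * mask[i][j] for j in range(width)] for i in range(height)]
        for c in range(channels)
    ]
    return gated, mask


# Toy input: 2 channels, 2 x 2 locations. The top-left location responds
# strongly, the top-right responds negatively, the bottom row is neutral.
toy_features = [
    [[4.0, -4.0], [0.0, 0.0]],
    [[4.0, -4.0], [0.0, 0.0]],
]
toy_gated, toy_mask = spatial_attention(toy_features)
```

Because the output has the same shape as the input, a gate of this kind can sit directly after a convolution or pooling layer without changing the rest of the network, which is the sense in which such modules are usually described as plug-and-play.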