Adaptive Density Map Generation for Crowd Counting

2019 IEEE/CVF International Conference on Computer Vision (ICCV) Pub Date : 2019-10-01 DOI:10.1109/ICCV.2019.00122

Jia Wan, Antoni B. Chan

Pages: 1130-1139

Citations: 132

Abstract

Crowd counting is an important topic in computer vision because of its practical use in surveillance systems. A typical crowd counting algorithm has two steps. First, the ground-truth density maps of crowd images are generated from the ground-truth dot maps (density map generation), e.g., by convolving with a Gaussian kernel. Second, deep learning models are designed to predict a density map from an input image (density map estimation). Most research efforts have concentrated on the density map estimation problem, while the problem of density map generation has not been adequately explored. In particular, the density map can be considered an intermediate representation used to train a crowd counting network. From the perspective of end-to-end training, the hand-crafted methods used for generating the density maps may not be optimal for the particular network or dataset used. To address this issue, we first show the impact of different density maps, and show that better ground-truth density maps can be obtained by refining existing ones with a learned refinement network that is jointly trained with the counter. Then, we propose an adaptive density map generator, which takes the annotation dot map as input and learns a density map representation for a counter. The counter and generator are trained jointly within an end-to-end framework. Experimental results on popular counting datasets confirm the effectiveness of the proposed learnable density map representations.
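To make the first step concrete, the following is a minimal sketch of the hand-crafted baseline the abstract mentions (placing a Gaussian kernel on each annotated dot), not the learned generator the paper proposes. The function names, the `(row, col)` annotation format, the fixed `sigma`, and the 3-sigma truncation are all assumptions made for illustration. Each kernel is renormalized over the pixels inside the image, so the map sums to the number of annotated people.

```python
import math


def gaussian_density_map(points, height, width, sigma=2.0):
    """Build a density map with one normalized Gaussian per annotated dot.

    points: list of (row, col) head annotations (hypothetical format),
            assumed to lie inside the image.
    sigma:  fixed kernel bandwidth in pixels (an illustrative choice).
    """
    density = [[0.0] * width for _ in range(height)]
    radius = int(math.ceil(3 * sigma))  # truncate the kernel at 3 sigma

    for r0, c0 in points:
        # Unnormalized Gaussian weights on the in-image part of the window.
        weights = {}
        for r in range(max(0, r0 - radius), min(height, r0 + radius + 1)):
            for c in range(max(0, c0 - radius), min(width, c0 + radius + 1)):
                d2 = (r - r0) ** 2 + (c - c0) ** 2
                weights[(r, c)] = math.exp(-d2 / (2.0 * sigma ** 2))

        total = sum(weights.values())
        if total == 0.0:
            continue  # dot too far outside the image to contribute

        # Renormalize so each person contributes exactly 1 to the map.
        for (r, c), w in weights.items():
            density[r][c] += w / total

    return density


def count_from_density(density):
    """The crowd count is the sum of the density map."""
    return sum(sum(row) for row in density)


dots = [(5, 5), (5, 8), (20, 30), (0, 0)]
dmap = gaussian_density_map(dots, height=32, width=48, sigma=2.0)
print(round(count_from_density(dmap), 6))
```

The printed sum should match the number of dots up to floating-point error. Because `sigma` is chosen by hand here, the resulting map is exactly the kind of fixed intermediate representation that the abstract argues may not be optimal for a given counter or dataset.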
