Temporal Extension for Encoder-Decoder-based Crowd Counting Approaches

2021 17th International Conference on Machine Vision and Applications (MVA) Pub Date: 2021-07-25 DOI: 10.23919/MVA51890.2021.9511351

T. Golda, F. Krüger, J. Beyerer


Abstract

Crowd counting is an important aspect of safety monitoring at mass events and can be used to initiate safety measures in time. State-of-the-art encoder-decoder architectures are able to estimate the number of people in a scene precisely. However, since most of the proposed methods are designed to operate solely on single-image features, we observe that estimated counts for aerial video sequences are inherently noisy, which in turn reduces the significance of the overall estimates. In this paper, we propose a simple temporal extension to these encoder-decoder architectures that incorporates local context from multiple frames into the estimation process. By applying the temporal extension to state-of-the-art architectures and exploring multiple configuration settings, we find that the resulting estimates are more precise and smoother over time.
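To make the notion of noisy per-frame counts and "smoother over time" concrete, here is a minimal sketch in Python. It is not the method proposed in the paper, which incorporates multi-frame context inside the encoder-decoder network; it only applies a post-hoc moving average to per-frame counts as an illustration. The function names, the window radius, and the sample counts are all hypothetical. The one domain assumption is the common density-map convention, in which the estimated count is the sum over all pixels of the predicted density map.

```python
from statistics import mean


def count_from_density(density_map):
    """Density-map convention: the estimated count is the sum over all pixels."""
    return sum(sum(row) for row in density_map)


def temporal_window_mean(counts, radius=1):
    """Average each per-frame count with its neighbours within `radius` frames.

    Illustrative post-hoc smoothing only; NOT the paper's temporal extension.
    """
    smoothed = []
    for t in range(len(counts)):
        lo = max(0, t - radius)
        hi = min(len(counts), t + radius + 1)
        smoothed.append(mean(counts[lo:hi]))
    return smoothed


def roughness(counts):
    """Mean absolute change between consecutive frames (lower means smoother)."""
    return mean([abs(b - a) for a, b in zip(counts, counts[1:])])


# Hypothetical single-image estimates for a scene with a roughly constant crowd.
per_frame = [100, 108, 95, 104, 97, 106]
smoothed = temporal_window_mean(per_frame, radius=1)

print("roughness, per-frame:", roughness(per_frame))
print("roughness, smoothed: ", roughness(smoothed))
```

In this toy sequence the frame-to-frame variation drops once neighbouring frames are taken into account, which is the kind of temporal consistency the abstract refers to. Note that a smoother sequence is not automatically a more precise one; the abstract reports both properties for the proposed extension.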

