{"title":"基于图像的人群计数的数据标注效率研究","authors":"Tianfang Ma, Shuoyan Liu, Qian Wang","doi":"10.1109/VCIP56404.2022.10008825","DOIUrl":null,"url":null,"abstract":"Crowd counting aims at automatically estimating the number of persons in still images. It has attracted much attention due to its potential usage in surveillance, intelligent transportation and many other scenarios. In the recent decade, most researchers have been focusing on the design of novel deep learning models for improved crowd counting performance. Such attempts include proposing advanced architectures of deep neural networks, using different training strategies and loss functions. Other than the capabilities of models, the crowd counting performance is also determined by the quantity and the quality of training data. Whilst the deep models are data-hungry and better performance can usually be expected with more training data, annotating images for training is time-consuming and expensive in real-world applications. In this work, we focus on the efficiency of data annotation for crowd counting. By varying the number of annotated images and the number of annotated points (one point is annotated per person head) for training, our experimental results demonstrate it is more efficient to annotate a small number of points per image across a large number of images for training. Based on this conclusion, we present a novel adaptive scaling mechanism for data augmentation to diversify the training images without extra annotation cost. The mechanism is proved effective via thorough experiments.","PeriodicalId":269379,"journal":{"name":"2022 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"488 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"On Data Annotation Efficiency for Image Based Crowd Counting\",\"authors\":\"Tianfang Ma, Shuoyan Liu, Qian Wang\",\"doi\":\"10.1109/VCIP56404.2022.10008825\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Crowd counting aims at automatically estimating the number of persons in still images. It has attracted much attention due to its potential usage in surveillance, intelligent transportation and many other scenarios. In the recent decade, most researchers have been focusing on the design of novel deep learning models for improved crowd counting performance. Such attempts include proposing advanced architectures of deep neural networks, using different training strategies and loss functions. Other than the capabilities of models, the crowd counting performance is also determined by the quantity and the quality of training data. Whilst the deep models are data-hungry and better performance can usually be expected with more training data, annotating images for training is time-consuming and expensive in real-world applications. In this work, we focus on the efficiency of data annotation for crowd counting. By varying the number of annotated images and the number of annotated points (one point is annotated per person head) for training, our experimental results demonstrate it is more efficient to annotate a small number of points per image across a large number of images for training. Based on this conclusion, we present a novel adaptive scaling mechanism for data augmentation to diversify the training images without extra annotation cost. 
The mechanism is proved effective via thorough experiments.\",\"PeriodicalId\":269379,\"journal\":{\"name\":\"2022 IEEE International Conference on Visual Communications and Image Processing (VCIP)\",\"volume\":\"488 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-12-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE International Conference on Visual Communications and Image Processing (VCIP)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/VCIP56404.2022.10008825\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE International Conference on Visual Communications and Image Processing (VCIP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/VCIP56404.2022.10008825","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
On Data Annotation Efficiency for Image Based Crowd Counting
Crowd counting aims to automatically estimate the number of persons in still images. It has attracted much attention due to its potential applications in surveillance, intelligent transportation and many other scenarios. Over the past decade, most researchers have focused on designing novel deep learning models for improved crowd counting performance. Such attempts include proposing advanced deep neural network architectures and using different training strategies and loss functions. Beyond the capability of the model, crowd counting performance is also determined by the quantity and quality of the training data. While deep models are data-hungry and more training data usually leads to better performance, annotating images for training is time-consuming and expensive in real-world applications. In this work, we focus on the efficiency of data annotation for crowd counting. By varying the number of annotated images and the number of annotated points (one point per person's head) used for training, our experimental results demonstrate that it is more efficient to annotate a small number of points per image across a large number of images. Based on this conclusion, we present a novel adaptive scaling mechanism for data augmentation that diversifies the training images without extra annotation cost. Thorough experiments prove the mechanism effective.
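To make the two ideas in the abstract concrete, the sketch below illustrates partial point annotation (keeping only a few labeled heads per image) and a label-preserving random rescaling used as augmentation. This is a minimal sketch under stated assumptions: the paper does not disclose its implementation, and the function names (`subsample_points`, `adaptive_scale`), the scale range, and the use of PIL are illustrative choices rather than the authors' actual mechanism.

```python
import random
from PIL import Image

def subsample_points(points, k):
    """Keep at most k annotated head points for one image (partial annotation).

    `points` is a list of (x, y) head coordinates; returns a random subset.
    """
    if len(points) <= k:
        return list(points)
    return random.sample(list(points), k)

def adaptive_scale(image, points, scale_range=(0.7, 1.3)):
    """Rescale an image and its head-point annotations by a random factor.

    Because the point labels are scaled together with the image, the
    augmented sample requires no extra annotation effort.
    """
    s = random.uniform(*scale_range)
    w, h = image.size
    scaled_image = image.resize(
        (int(round(w * s)), int(round(h * s))), Image.BILINEAR
    )
    scaled_points = [(x * s, y * s) for (x, y) in points]
    return scaled_image, scaled_points
```

In this illustrative setup, the two routines would be applied on the fly inside the training data loader, so each epoch sees differently scaled copies of the same images with consistently transformed point labels.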