CNN-Based Analysis of Crowd Structure using Automatically Annotated Training Data

M. S. Zitouni, A. Sluzek, H. Bhaskar
2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), published 2019-09-01
DOI: 10.1109/AVSS.2019.8909846 (https://doi.org/10.1109/AVSS.2019.8909846)
Citations: 3

Abstract

A CNN-based framework is presented for extracting and classifying individuals, small groups, and large groups from static images of crowds acquired from surveillance systems. A novel approach to network training is investigated: instead of manually outlined ground-truth data, we use automatic annotations produced by alternative baseline algorithms, which consider both motion and appearance. The proposed CNN detectors are initially trained on rather limited amounts of data and are subsequently updated (fine-tuned) with new batches of automatically annotated samples. These samples are periodically acquired by the baseline algorithms from incoming surveillance data. Fine-tuning is performed when noticeable differences appear between the results of the CNN detectors and those of the baseline algorithms, which may indicate changes in visual conditions or scenarios, or updates to the baseline algorithms. We provide a preliminary demonstration that CNN-based detectors can achieve satisfactory performance even when the baseline algorithms have limited accuracy. In fact, fine-tuned CNN detectors were observed to outperform the baseline algorithms used to annotate the training data, even though the baseline algorithms process both static images and video sequences. Since only static images are needed once the detectors are fully trained, the presented solution can reduce the complexity of systems that automatically evaluate the structure and behavior of crowds.
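The fine-tuning trigger described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how "noticeable differences" are measured, so everything below is an assumption made for illustration: the `Detection` type, the IoU-based matching, the `disagreement` score (fraction of baseline annotations with no matching CNN detection), and the `0.5` and `0.3` thresholds are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


@dataclass(frozen=True)
class Detection:
    box: Box
    label: str  # e.g. "individual", "small_group", "large_group"


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def disagreement(cnn: List[Detection], baseline: List[Detection],
                 iou_thr: float = 0.5) -> float:
    """Fraction of baseline annotations that no same-label CNN detection
    overlaps with IoU >= iou_thr (greedy one-to-one matching)."""
    if not baseline:
        # Assumed convention: any CNN output on an empty frame is a mismatch.
        return 1.0 if cnn else 0.0
    unused = list(cnn)
    matched = 0
    for ref in baseline:
        best, best_iou = None, iou_thr
        for det in unused:
            if det.label == ref.label:
                overlap = iou(det.box, ref.box)
                if overlap >= best_iou:
                    best, best_iou = det, overlap
        if best is not None:
            unused.remove(best)
            matched += 1
    return 1.0 - matched / len(baseline)


def needs_fine_tuning(frames: List[Tuple[List[Detection], List[Detection]]],
                      max_disagreement: float = 0.3) -> bool:
    """True when the mean per-frame disagreement over a batch of
    (cnn_detections, baseline_annotations) pairs exceeds the threshold."""
    scores = [disagreement(cnn, base) for cnn, base in frames]
    return bool(scores) and sum(scores) / len(scores) > max_disagreement
```

In such a setup, a monitoring loop would periodically collect a batch of frames, run both the CNN detectors and the baseline algorithms on it, and start a fine-tuning pass on the automatically annotated batch only when `needs_fine_tuning` returns `True`. Note that this particular score ignores extra CNN detections on frames where the baseline found something; a symmetric measure would be an equally plausible choice.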