Classification and Temporal Localization of Robbery Events in CCTV Videos through Multi-Stream Deep Networks

Zakia Yahya, M. M. Ullah
Published in: 2019 IEEE 16th International Conference on Smart Cities: Improving Quality of Life Using ICT & IoT and AI (HONET-ICT)
Publication date: 2019-10-01
DOI: 10.1109/HONET.2019.8908040 (https://doi.org/10.1109/HONET.2019.8908040)
Citations: 3

Abstract

Robbery is an open social problem. To help tackle it, this paper proposes multi-stream deep networks for both the classification and the temporal localization of robbery events in CCTV videos. In the multi-stream architecture, each stream consists of a pre-trained 3D ConvNet combined with an LSTM, followed by a softmax layer. In particular, we investigate three streams based on three different types of input: (a) RGB data, (b) optical flow, and (c) foreground masks. Each stream is trained independently, and the final scores are averaged to produce predictions. To test the approach, we compile a robbery dataset from YouTube, which contains 124 untrimmed CCTV videos. Empirical comparison with several state-of-the-art methods demonstrates the promise of our multi-stream model in both the classification and the temporal localization tasks.
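
The score-averaging step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the two-class layout (normal, robbery), the function names `softmax`, `fuse_streams`, and `localize`, and the logit values are all hypothetical. The abstract also does not say how temporal localization is derived from the scores, so the thresholding step here is only an assumed, simple way to turn per-clip scores into segments.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(stream_logits):
    """Average the softmax scores of independently trained streams.

    stream_logits holds one logit vector per stream (e.g. RGB, optical
    flow, foreground mask). Returns the averaged class probabilities.
    """
    probs = [softmax(logits) for logits in stream_logits]
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]


def localize(clip_scores, threshold=0.5):
    """ASSUMPTION, not from the paper: group consecutive clips whose
    robbery score is >= threshold into (start, end) clip-index ranges,
    with end inclusive.
    """
    segments = []
    start = None
    for i, score in enumerate(clip_scores):
        if score >= threshold and start is None:
            start = i
        elif score < threshold and start is not None:
            segments.append((start, i - 1))
            start = None
    if start is not None:
        segments.append((start, len(clip_scores) - 1))
    return segments


# Hypothetical logits for three clips; each clip has [RGB, flow, mask]
# streams, and each stream outputs [normal, robbery] logits.
clips = [
    [[2.0, -1.0], [1.5, 0.0], [1.0, -0.5]],
    [[-1.0, 2.0], [0.0, 1.0], [-0.5, 1.5]],
    [[-2.0, 2.5], [-1.0, 1.0], [0.2, 0.1]],
]
robbery_scores = [fuse_streams(clip)[1] for clip in clips]
segments = localize(robbery_scores)
```

Averaging after the softmax (late fusion) keeps the streams independent, which matches the abstract's statement that each stream is trained separately; a weighted average or a learned fusion layer would be alternative design choices that the abstract does not mention.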