Visual Attention based Cognitive Informative Frame Extraction Method for Smart Crowd Surveillance

2021 IEEE Conference on Norbert Wiener in the 21st Century (21CW). Pub Date: 2021-07-22. DOI: 10.1109/21CW48944.2021.9532519

Elizabeth B. Varghese, S. Thampi



Abstract

In smart surveillance systems, the volume of video data has grown exponentially with the number of monitoring devices and IoT sensors. Making smart, real-time decisions from such voluminous data without communication latency is a tedious task. In this context, selecting informative frames from the video is important because it allows only the salient features to be extracted for further processing, which eases latency and bandwidth constraints. In this paper, we propose a fast and reliable method for selecting informative frames from video sequences that is based on the human cognitive process of visual attention and preserves the spatio-temporal properties of the video. The proposed method extracts informative frames using a frame informative score computed from visual attention maps, superpixel segmentation, and temporal information. Since our purpose is to analyze crowd behavior from video data in a smart environment, we use two publicly available crowd video datasets in our experiments. The results show that the proposed approach successfully extracts relevant video frames in linear time while preserving their spatial and temporal properties. We also analyze the feasibility of the proposed method in a simulated fog computing-based IoT framework and verify that the proposed cognitive approach can efficiently address latency and bandwidth concerns in smart surveillance environments.
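The abstract does not give the formula for the frame informative score, so the following is only a minimal sketch of how attention, region pooling, and temporal information might be combined. Everything in it is an assumption made for illustration: center-surround contrast stands in for a visual attention map, fixed grid blocks stand in for superpixels, and the attention-weighted temporal change and the mean-score threshold are hypothetical choices, not the authors' method. The names `attention_map`, `block_means`, `frame_scores`, and `select_frames` are likewise invented here.

```python
import numpy as np

def box_blur(img, k):
    """Mean filter with an odd k x k window (edge-padded), via an integral image."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    # Prepend a zero row and column so window sums are four-corner differences.
    integral = np.pad(padded, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    h, w = img.shape
    total = (integral[k:k + h, k:k + w] - integral[:h, k:k + w]
             - integral[k:k + h, :w] + integral[:h, :w])
    return total / (k * k)

def attention_map(frame, k=9):
    """ASSUMPTION: center-surround contrast as a crude proxy for visual attention."""
    return np.abs(frame - box_blur(frame, k))

def block_means(arr, block):
    """ASSUMPTION: fixed grid blocks used in place of superpixel regions."""
    h, w = arr.shape
    hb, wb = h // block, w // block
    trimmed = arr[:hb * block, :wb * block]
    return trimmed.reshape(hb, block, wb, block).mean(axis=(1, 3))

def frame_scores(frames, block=8, k=9):
    """Score each frame by temporal change weighted by per-region attention.

    frames: float array of shape (T, H, W). The first frame has no
    predecessor, so its score is set to 0.0 (hypothetical convention).
    """
    scores = [0.0]
    for prev, cur in zip(frames[:-1], frames[1:]):
        sal = block_means(attention_map(cur, k), block)
        motion = block_means(np.abs(cur - prev), block)
        total = sal.sum()
        # Fall back to uniform weights when the frame has no contrast at all.
        weights = sal / total if total > 0 else np.full(sal.shape, 1.0 / sal.size)
        scores.append(float((weights * motion).sum()))
    return np.array(scores)

def select_frames(frames, block=8, k=9):
    """Keep frame 0 plus every frame scoring above the mean score (assumed rule)."""
    scores = frame_scores(frames, block, k)
    keep = np.flatnonzero(scores > scores.mean())
    return sorted({0} | set(keep.tolist()))
```

Each frame is visited once and every per-frame step touches each pixel a constant number of times, so this sketch runs in time linear in the number of frames, which is in the spirit of the linear-time claim above. A real implementation would likely swap the grid blocks for an actual superpixel algorithm and the contrast proxy for a proper saliency model.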

