Edge-Enhanced TempoFuseNet: A Two-Stream Framework for Intelligent Multiclass Video Anomaly Recognition in 5G and IoT Environments

Future Internet. Pub Date: 2024-02-29. DOI: 10.3390/fi16030083

Gulshan Saleem, U. I. Bajwa, R. H. Raza, Fan Zhang



Abstract

Surveillance video analytics faces significant challenges in 5G and IoT environments, including complex intra-class variations, short-term and long-term temporal dynamics, and variable video quality. This study introduces Edge-Enhanced TempoFuseNet, a framework that strategically reduces spatial resolution to allow the processing of low-resolution images. A dual upscaling methodology based on bicubic interpolation and an encoder–bank–decoder configuration is used for anomaly classification. In the two-stream architecture, the spatial stream uses a pre-trained Convolutional Neural Network (CNN) to extract spatial features from RGB imagery, while the temporal stream learns short-term temporal characteristics, reducing the computational burden of optical flow. To analyze long-term temporal patterns, the features extracted from both streams are combined and routed through a Gated Recurrent Unit (GRU) layer. The proposed framework (TempoFuseNet) outperforms the encoder–bank–decoder model, achieving a multiclass macro-average accuracy of 92.28%, an F1-score of 69.29%, and a false positive rate of 4.41%. This study advances video anomaly recognition and offers a solution to the complex challenges posed by real-world surveillance scenarios in the context of 5G and IoT.
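The reported figures pair a high macro-average accuracy with a much lower F1-score. One way such a gap can arise is if accuracy is computed one-vs-rest per class, where true negatives from all other classes raise the score. The abstract does not state how the metrics were computed, so the sketch below is only an illustration under that assumption. The function name `macro_metrics` and the confusion matrix are hypothetical, and the counts are made up rather than taken from the paper.

```python
from typing import Dict, List


def macro_metrics(confusion: List[List[int]]) -> Dict[str, float]:
    """Macro-average one-vs-rest accuracy, F1, and false positive rate.

    confusion[i][j] counts samples of true class i predicted as class j.
    Assumption: each metric is computed per class, then averaged unweighted.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    acc, f1, fpr = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                        # class k missed
        fp = sum(confusion[i][k] for i in range(n)) - tp   # others labeled k
        tn = total - tp - fn - fp
        acc.append((tp + tn) / total)
        denom = 2 * tp + fp + fn
        f1.append(2 * tp / denom if denom else 0.0)
        fpr.append(fp / (fp + tn) if (fp + tn) else 0.0)
    return {
        "accuracy": sum(acc) / n,
        "f1": sum(f1) / n,
        "fpr": sum(fpr) / n,
    }


# Hypothetical 3-class confusion matrix (made-up counts, not from the paper).
example = [
    [8, 1, 1],
    [2, 6, 2],
    [1, 1, 8],
]
metrics = macro_metrics(example)
print(metrics)
```

For this toy matrix the macro one-vs-rest accuracy is about 0.82 while the macro F1 is about 0.73, because every class gets credit for the true negatives contributed by the other classes. The effect grows with the number of classes.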
