Title: Unsupervised video anomaly detection in UAVs: a new approach based on learning and inference
Authors: Gang Liu, Lisheng Shu, Yuhui Yang, Chen Jin
Journal: Frontiers in Sustainable Cities (Impact Factor 2.4, JCR Q3, Environmental Sciences)
Publication date: 2023-06-07 (Journal Article)
DOI: 10.3389/frsc.2023.1197434 (https://doi.org/10.3389/frsc.2023.1197434)
Citations: 0
Abstract
This paper introduces an unsupervised approach to detecting anomalous events in video data that leverages contextual information derived from visual features, addressing the semantic gap between visual information and the interpretation of abnormal events. Our work uses Unmanned Aerial Vehicles (UAVs) to capture video from an aerial perspective and thereby provide a distinctive set of visual features. Specifically, we propose a context-learning technique based on scene understanding, which constructs a spatio-temporal context graph to represent multiple aspects of the visual information: the appearance of objects, their interrelations in the spatio-temporal domain, and the categories of the scenes captured by the UAVs. To encode this context, we apply a Transformer with message passing to update the graph's nodes and edges. Furthermore, we design a graph-based deep Variational Autoencoder (VAE) for unsupervised scene categorization, enabling extraction of the spatio-temporal context graph across diverse settings. Finally, using the contextual information, we compute frame-level anomaly scores to identify abnormal events. We evaluated the proposed approach on three challenging datasets, UCF-Crime, Avenue, and ShanghaiTech, and the results demonstrate its effective performance.
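The two core mechanisms the abstract names — attention-based message passing over a spatio-temporal graph, and a frame-level anomaly score derived from reconstruction — can be illustrated in miniature. The sketch below is purely illustrative and is not the authors' implementation: the function names, the single-head attention formulation, and the use of plain mean-squared reconstruction error as a stand-in for the VAE's reconstruction term are all assumptions for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_message_passing(nodes, adj, Wq, Wk, Wv):
    """One round of attention-weighted message passing over a graph.

    nodes: (N, d) node features (e.g. per-object appearance features);
    adj:   (N, N) adjacency mask (1 where a spatio-temporal edge exists);
    Wq/Wk/Wv: (d, d) projection matrices. Each node attends only to
    its neighbours, mimicking a Transformer layer restricted to edges.
    """
    Q, K, V = nodes @ Wq, nodes @ Wk, nodes @ Wv
    scores = (Q @ K.T) / np.sqrt(K.shape[1])
    scores = np.where(adj > 0, scores, -1e9)  # mask out non-edges
    return softmax(scores, axis=1) @ V

def frame_anomaly_score(node_feats, recon_feats):
    """Frame-level anomaly score as mean reconstruction error — a
    simple stand-in for scoring against a VAE's reconstruction."""
    return float(np.mean((node_feats - recon_feats) ** 2))
```

In this toy setting, a frame whose graph features are reconstructed poorly (high `frame_anomaly_score`) would be flagged as anomalous; the actual method additionally conditions on the unsupervised scene category.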