Cross-transformer learning network for abnormal crowd human behavior detection from UAV captured images

IF 6.9 · CAS Zone 1 (Management) · JCR Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS

Information Processing & Management · Pub Date: 2025-09-01 · DOI: 10.1016/j.ipm.2025.104374

Min Zhu, Dengyin Zhang

Volume 63, Issue 2, Article 104374 · https://www.sciencedirect.com/science/article/pii/S0306457325003152

Citations: 0

Abstract

The detection of abnormal behavior in public environments is crucial for maintaining public safety and optimizing surveillance systems. With the growing deployment of unmanned aerial vehicles (UAVs) for aerial monitoring, accurately identifying abnormal crowd behavior from UAV-captured images has become a significant challenge due to occlusions, high-density scenes, and limited spatial resolution. Traditional approaches struggle with real-time adaptability and accuracy under these complex conditions. Hence, the research proposes a Cross-Transformer Learning Network that integrates spatio-temporal attention mechanisms and dynamic boundary adaptation to enhance anomaly detection in UAV surveillance data. The novel model enables pattern boundary cross-matching and feature distributions to accurately identify behavioral anomalies across high-density and occluded environments. The model iteratively refines the learned representations until the maximum responsive pixel region is identified, effectively minimizing variations, boundary detection, and pattern extraction. The model retains critical spatial-temporal correlations across frames and improves the detection of nuanced abnormalities. Through training input correlations, precise patterns are identified for the object/human/crowd boundaries to detect abnormalities. Experiments conducted on benchmark datasets, such as UCSD and Abnormal High-Density Crowds, show that the suggested approach significantly outperforms conventional models, including ConvLSTM and Hidden Markov Models (HMM). In particular, it achieves an accuracy gain of 12.31% and a recall increase of 13.09%, thereby emphasizing its implementation in challenging UAV surveillance scenarios. The proposed framework addresses a crucial gap in UAV-based surveillance by offering a scalable and highly precise method for detecting abnormal human behavior in complex environments, thereby paving the way for a more responsive and intelligent public safety monitoring system.
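
The abstract does not give the network's equations, so the following is only a generic illustration of the kind of cross-frame attention it alludes to, not the authors' architecture. It shows plain scaled dot-product cross-attention in NumPy, where patch features from one frame attend over patch features from a neighboring frame. The function names, the feature sizes, and the absence of learned projection matrices are all assumptions made for brevity.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def cross_attention(query_feats, context_feats):
    """Scaled dot-product cross-attention (hypothetical helper).

    query_feats:   (n_q, d) patch features of the current frame.
    context_feats: (n_c, d) patch features of a neighboring frame.
    Returns the attended features (n_q, d) and the weights (n_q, n_c).
    No learned query/key/value projections are applied here.
    """
    d = query_feats.shape[-1]
    scores = query_feats @ context_feats.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)  # each row sums to 1
    attended = weights @ context_feats
    return attended, weights


# Toy data: 6 patch tokens with 8-dimensional features per frame (assumed sizes).
rng = np.random.default_rng(0)
frame_t = rng.normal(size=(6, 8))
frame_prev = rng.normal(size=(6, 8))

attended, weights = cross_attention(frame_t, frame_prev)
print(attended.shape, weights.shape)
```

Each row of `weights` indicates how strongly one patch of the current frame draws on each patch of the neighboring frame; a full transformer layer would add learned projections, multiple heads, and residual connections on top of this.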




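Source journal

Information Processing & Management (Engineering & Technology, Computer Science: Information Systems)

CiteScore: 17.00

Self-citation rate: 11.60%

Articles published: 276

Review time: 39 days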

About the journal: Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing. We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.