CASSANDRA: audio-video sensor fusion for aggression detection

2007 IEEE Conference on Advanced Video and Signal Based Surveillance. Pub Date: 2007-09-05. DOI: 10.1109/AVSS.2007.4425310

W. Zajdel, J. D. Krijnders, T. Andringa, D. Gavrila

Citations: 113

Abstract

This paper presents CASSANDRA, a smart surveillance system aimed at detecting instances of aggressive human behavior in public environments. A distinguishing aspect of CASSANDRA is its exploitation of the complementary nature of audio and video sensing to disambiguate scene activity in real-life, noisy and dynamic environments. At the lower level, independent analysis of the audio and video streams yields intermediate descriptors of a scene, such as "scream", "passing train" or "articulation energy". At the higher level, a Dynamic Bayesian Network is used as a fusion mechanism that produces an aggregate aggression indication for the current scene. Our prototype system is validated on a set of scenarios performed by professional actors at an actual train station to ensure a realistic audio and video noise setting.
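The two-level design described in the abstract can be illustrated with a minimal sketch. It is not the network from the paper: it assumes the simplest possible Dynamic Bayesian Network, a two-state filter over a hidden "calm"/"aggressive" variable, and it assumes the audio and video descriptors are conditionally independent given that state. The state names, the discretization of "articulation energy" into low/high, the helper names `filter_step` and `aggression_trace`, and every probability below are hypothetical values chosen for illustration only.

```python
# Hypothetical two-state filter: the simplest form of a Dynamic Bayesian Network.
# All probabilities are illustrative assumptions, not values from the paper.

STATES = ("calm", "aggressive")

# Assumed transition model P(state_t | state_{t-1}).
TRANSITION = {
    "calm": {"calm": 0.95, "aggressive": 0.05},
    "aggressive": {"calm": 0.10, "aggressive": 0.90},
}

# Assumed audio model P(audio descriptor | state).
AUDIO_LIKELIHOOD = {
    "calm": {"quiet": 0.70, "passing train": 0.25, "scream": 0.05},
    "aggressive": {"quiet": 0.20, "passing train": 0.20, "scream": 0.60},
}

# Assumed video model P(video descriptor | state), with
# "articulation energy" discretized into low/high for this sketch.
VIDEO_LIKELIHOOD = {
    "calm": {"low": 0.80, "high": 0.20},
    "aggressive": {"low": 0.30, "high": 0.70},
}


def filter_step(belief, audio, video):
    """Run one predict-update step and return the posterior over STATES."""
    # Predict: propagate the previous belief through the transition model.
    predicted = {
        s: sum(belief[p] * TRANSITION[p][s] for p in STATES) for s in STATES
    }
    # Update: multiply in both sensor likelihoods (conditional independence
    # of audio and video given the state is an assumption of this sketch).
    unnormalized = {
        s: predicted[s] * AUDIO_LIKELIHOOD[s][audio] * VIDEO_LIKELIHOOD[s][video]
        for s in STATES
    }
    total = sum(unnormalized.values())
    return {s: v / total for s, v in unnormalized.items()}


def aggression_trace(observations, prior=None):
    """Return P(aggressive) after each (audio, video) observation pair."""
    belief = dict(prior) if prior else {"calm": 0.9, "aggressive": 0.1}
    trace = []
    for audio, video in observations:
        belief = filter_step(belief, audio, video)
        trace.append(belief["aggressive"])
    return trace
```

Calling `aggression_trace([("quiet", "low"), ("scream", "high"), ("scream", "high")])` returns one aggression probability per time step. Under these assumed numbers, a scream that coincides with high articulation energy raises the estimate sharply, while a passing train with low articulation energy leaves it low, which is the kind of disambiguation the abstract attributes to combining both modalities.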



