{"title":"Multiple Instance Learning for CNN Based Fire Detection and Localization","authors":"M. Aktas, Ali Bayramcavus, Toygar Akgun","doi":"10.1109/AVSS.2019.8909842","DOIUrl":"https://doi.org/10.1109/AVSS.2019.8909842","url":null,"abstract":"Motivated by the state-of-the-art performance achieved by convolutional neural networks (CNN) in visual detection and classification tasks, CNNs have recently been applied to the visual fire detection problem. In this work, we extend the existing CNN based approaches to fire detection in video sequences by incorporating Multiple Instance Learning (MIL). MIL relaxes the requirement of having accurate locations of fire patches in video frames, which are needed for patch level CNN training. Instead, only frame level labels indicating the presence of fire somewhere in a video frame are needed, substantially alleviating the annotation and training efforts. The resulting approach is tested on a new fire dataset obtained by extending some of the previously used fire datasets with video sequences collected from the web. Experimental results show that the proposed method improves fire detection performance upto 2.5%, while providing patch level localization without requiring patch level annotations.","PeriodicalId":243194,"journal":{"name":"2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128564435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Video-based Bottleneck Detection utilizing Lagrangian Dynamics in Crowded Scenes","authors":"Maik Simon, Markus Küchhold, T. Senst, Erik Bochinski, T. Sikora","doi":"10.1109/AVSS.2019.8909861","DOIUrl":"https://doi.org/10.1109/AVSS.2019.8909861","url":null,"abstract":"Avoiding bottleneck situations in crowds is critical for the safety and comfort of people at large events or in public transportation. Based on the work of Lagrangian motion analysis we propose a novel video-based bottleneck-detector by identifying characteristic stowage patterns in crowd-movements captured by optical flow fields. The Lagrangian framework allows to assess complex time-dependent crowd-motion dynamics at large temporal scales near the bottleneck by two dimensional Lagrangian fields. In particular we propose long-term temporal filtered Finite Time Lyapunov Exponents (FTLE) fields that provide towards a more global segmentation of the crowd movements and allows to capture its deformations when a crowd is passing a bottleneck. Finally, these deformations are used for an automatic spatio-temporal detection of such situations. The performance of the proposed approach is shown in extensive evaluations on the existing Jülich and AGO-RASET datasets, that we have updated with ground truth data for spatio-temporal bottleneck analysis.","PeriodicalId":243194,"journal":{"name":"2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"45 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116645936","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"What goes around comes around: Cycle-Consistency-based Short-Term Motion Prediction for Anomaly Detection using Generative Adversarial Networks","authors":"T. Golda, Nils Murzyn, Chengchao Qu, K. Kroschel","doi":"10.1109/AVSS.2019.8909853","DOIUrl":"https://doi.org/10.1109/AVSS.2019.8909853","url":null,"abstract":"Anomaly detection plays in many fields of research, along with the strongly related task of outlier detection, a very important role. Especially within the context of the automated analysis of video material recorded by surveillance cameras, abnormal situations can be of very different nature. For this purpose this work investigates Generative-Adversarial-Network-based methods (GAN) for anomaly detection related to surveillance applications. The focus is on the usage of static camera setups, since this kind of camera is one of the most often used and belongs to the lower price segment. In order to address this task, multiple subtasks are evaluated, including the influence of existing optical flow methods for the incorporation of short-term temporal information, different forms of network setups and losses for GANs, and the use of morphological operations for further performance improvement. With these extension we achieved up to 2.4% better results. Furthermore, the final method reduced the anomaly detection error for GAN based methods by about 42.8%.","PeriodicalId":243194,"journal":{"name":"2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124089715","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SkeleMotion: A New Representation of Skeleton Joint Sequences based on Motion Information for 3D Action Recognition","authors":"C. Caetano, Jessica Sena, F. Brémond, J. A. D. Santos, W. R. Schwartz","doi":"10.1109/AVSS.2019.8909840","DOIUrl":"https://doi.org/10.1109/AVSS.2019.8909840","url":null,"abstract":"Due to the availability of large-scale skeleton datasets, 3D human action recognition has recently called the attention of computer vision community. Many works have focused on encoding skeleton data as skeleton image representations based on spatial structure of the skeleton joints, in which the temporal dynamics of the sequence is encoded as variations in columns and the spatial structure of each frame is represented as rows of a matrix. To further improve such representations, we introduce a novel skeleton image representation to be used as input of Convolutional Neural Networks (CNNs), named SkeleMotion. The proposed approach encodes the temporal dynamics by explicitly computing the magnitude and orientation values of the skeleton joints. Different temporal scales are employed to compute motion values to aggregate more temporal dynamics to the representation making it able to capture long-range joint interactions involved in actions as well as filtering noisy motion values. Experimental results demonstrate the effectiveness of the proposed representation on 3D action recognition outperforming the state-of-the-art on NTU RGB+D 120 dataset.","PeriodicalId":243194,"journal":{"name":"2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122246929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Human Pose Estimation for Real-World Crowded Scenarios","authors":"T. Golda, Tobias Kalb, Arne Schumann, Jürgen Beyerer","doi":"10.1109/AVSS.2019.8909823","DOIUrl":"https://doi.org/10.1109/AVSS.2019.8909823","url":null,"abstract":"Human pose estimation has recently made significant progress with the adoption of deep convolutional neural networks and many applications have attracted tremendous interest in recent years. However, many of these applications require pose estimation for human crowds, which still is a rarely addressed problem. For this purpose this work explores methods to optimize pose estimation for human crowds, focusing on challenges introduced with larger scale crowds like people in close proximity to each other, mutual occlusions, and partial visibility of people due to the environment. In order to address these challenges, multiple approaches are evaluated including: the explicit detection of occluded body parts, a data augmentation method to generate occlusions and the use of the synthetic generated dataset JTA [3]. In order to overcome the transfer gap of JTA originating from a low pose variety and less dense crowds, an extension dataset is created to ease the use for real-world applications.","PeriodicalId":243194,"journal":{"name":"2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"98 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132132333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automated Real-time Anomaly Detection in Human Trajectories using Sequence to Sequence Networks","authors":"Giorgos Bouritsas, Stelios Daveas, A. Danelakis, C. Rizogiannis, S. Thomopoulos","doi":"10.1109/AVSS.2019.8909844","DOIUrl":"https://doi.org/10.1109/AVSS.2019.8909844","url":null,"abstract":"Detection of anomalous trajectories is an important problem with potential applications to various domains, such as video surveillance, risk assessment, vessel monitoring and high-energy physics. Modeling the distribution of trajectories with statistical approaches has been a challenging task due to the fact that such time series are usually non stationary and highly dimensional. However, modern machine learning techniques provide robust approaches for data-driven modeling and critical information extraction. In this paper, we propose a Sequence to Sequence architecture for real-time detection of anomalies in human trajectories, in the context of risk-based security. Our detection scheme is tested on a synthetic dataset of diverse and realistic trajectories generated by the ISL iCrowd simulator [11], [12]. The experimental results indicate that our scheme accurately detects motions that deviate from normal behaviors and is promising for future real-world applications.","PeriodicalId":243194,"journal":{"name":"2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"222 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115658583","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TrackNet: A Deep Learning Network for Tracking High-speed and Tiny Objects in Sports Applications*","authors":"Yu-Chuan Huang, I-No Liao, Ching-Hsuan Chen, Tsì-Uí İk, Wen-Chih Peng","doi":"10.1109/AVSS.2019.8909871","DOIUrl":"https://doi.org/10.1109/AVSS.2019.8909871","url":null,"abstract":"Ball trajectory data are one of the most fundamental and useful information in the evaluation of players' performance and analysis of game strategies. It is still challenging to recognize and position a high-speed and tiny ball accurately from an ordinary video. In this paper, we develop a deep learning network, called TrackNet, to track the tennis ball from broadcast videos in which the ball images are small, blurry, and sometimes with afterimage tracks or even invisible. The proposed heatmap-based deep learning network is trained to not only recognize the ball image from a single frame but also learn flying patterns from consecutive frames. The network is evaluated on the video of the men's singles final at the 2017 Summer Universiade, which is available on YouTube. The precision, recall, and $F1$ -measure reach 99.7%, 97.3%, and 98.5%, respectively. To prevent overfitting, 9 additional videos are partially labeled together with a subset from the previous dataset to implement 10-fold cross-validation, and the precision, recall, and $F_{1}$ -measure are 95.3%, 75.7%, and 84.3%, respectively. The source code and dataset are available at https://nol.cs.nctu.edu.tw:234/open-source/TrackNet/.","PeriodicalId":243194,"journal":{"name":"2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"164 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114273664","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inverse Attention Guided Deep Crowd Counting Network","authors":"Vishwanath A. Sindagi, Vishal M. Patel","doi":"10.1109/AVSS.2019.8909889","DOIUrl":"https://doi.org/10.1109/AVSS.2019.8909889","url":null,"abstract":"In this paper, we address the challenging problem of crowd counting in congested scenes. Specifically, we present Inverse Attention Guided Deep Crowd Counting Network (IA-DCCN) that efficiently infuses segmentation information through an inverse attention mechanism into the counting network, resulting in significant improvements. The proposed method, which is based on VGG-16, is a single-step training framework and is simple to implement. The use of segmentation information does not require additional annotation efforts. We demonstrate the significance of segmentation guided inverse attention through a detailed analysis and ablation study. Furthermore, the proposed method is evaluated on three challenging crowd counting datasets and is shown to achieve significant improvements over several recent methods.","PeriodicalId":243194,"journal":{"name":"2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125902585","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"K-Same-Siamese-GAN: K-Same Algorithm with Generative Adversarial Network for Facial Image De-identification with Hyperparameter Tuning and Mixed Precision Training","authors":"Yi-Lun Pan, Min-Jhih Haung, Kuo-Teng Ding, Ja-Ling Wu, J. Jang","doi":"10.1109/AVSS.2019.8909866","DOIUrl":"https://doi.org/10.1109/AVSS.2019.8909866","url":null,"abstract":"For a data holder, such as a hospital or a government entity, who has a privately held collection of personal data, in which the revealing and/or processing of the personal identifiable data is restricted and prohibited by law. Then, “how can we ensure the data holder does conceal the identity of each individual in the imagery of personal data while still preserving certain useful aspects of the data after de-identification?” becomes a challenge issue. In this work, we propose an approach towards high-resolution facial image de-identification, called k-Same-Siamese-GAN, which leverages the k-Same-Anonymity mechanism, the Generative Adversarial Network, and the hyperparameter tuning methods. Moreover, to speed up model training and reduce memory consumption, the mixed precision training technique is also applied to make kSS-GAN provide guarantees regarding privacy protection on close-form identities and be trained much more efficiently as well. Finally, to validate its applicability, the proposed work has been applied to actual datasets - RafD and CelebA for performance testing. Besides protecting privacy of high-resolution facial images, the proposed system is also Justified for its ability in automating parameter tuning and breaking through the limitation of the number of adjustable parameters.","PeriodicalId":243194,"journal":{"name":"2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115880813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}