{"title":"A Similarity Metric for Multimodal Images Based on Modified Hausdorff Distance","authors":"Yong Li, R. Stevenson","doi":"10.1109/AVSS.2012.3","DOIUrl":"https://doi.org/10.1109/AVSS.2012.3","url":null,"abstract":"This paper presents a similarity metric on multimodal images utilizing curves as comparing primitives. Curves are detected from images, and then junctions are detected along curves and used to partition curves into subcurves. A modified Hausdorff distance is applied to determine whether a test subcurve is matched to a reference curve. The similarity metric is defined to be the number of matched curves. The number of overlapped edge pixels between two images is also defined on the basis of matched curves, which does not require accurately localizing edge pixels. The partitioning scheme avoids addressing curve partial matching and allows test subcurves to be matched to a reference curve if they correspond to each other. Experimental results show that the presented similarity metric gives more robust and reliable results, especially under noise.","PeriodicalId":275325,"journal":{"name":"2012 IEEE Ninth International Conference on Advanced Video and Signal-Based Surveillance","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125008334","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-scale Fusion of Texture and Color for Background Modeling","authors":"Zhong Zhang, Chunheng Wang, Baihua Xiao, Shuang Liu, Wen Zhou","doi":"10.1109/AVSS.2012.48","DOIUrl":"https://doi.org/10.1109/AVSS.2012.48","url":null,"abstract":"Background modeling from a stationary camera is a crucial component in video surveillance. Traditional methods usually adopt a single feature type to solve the problem, but their performance is often unsatisfactory when handling complex scenes. In this paper, we propose a multi-scale strategy, which combines both texture and color features, to achieve a robust and accurate solution. Our contributions are twofold: first, we propose a novel texture operator named Scale-invariant Center-symmetric Local Ternary Pattern, which is robust to noise and illumination variations; second, we propose a multi-scale fusion strategy for the problem. Our method is verified on several complex real-world videos with illumination variation, soft shadows and dynamic backgrounds. We compare our method with four state-of-the-art methods, and the experimental results clearly demonstrate that our method achieves the highest classification accuracy in complex real-world videos.","PeriodicalId":275325,"journal":{"name":"2012 IEEE Ninth International Conference on Advanced Video and Signal-Based Surveillance","volume":"92 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123327500","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Combined Learning of Salient Local Descriptors and Distance Metrics for Image Set Face Verification","authors":"Conrad Sanderson, M. Harandi, Yongkang Wong, B. Lovell","doi":"10.1109/AVSS.2012.23","DOIUrl":"https://doi.org/10.1109/AVSS.2012.23","url":null,"abstract":"In contrast to comparing faces via single exemplars, matching sets of face images increases robustness and discrimination performance. Recent image set matching approaches typically measure similarities between subspaces or manifolds, while representing faces in a rigid and holistic manner. Such representations are easily affected by variations in terms of alignment, illumination, pose and expression. While local feature based representations are considerably more robust to such variations, they have received little attention within the image set matching area. We propose a novel image set matching technique, comprised of three aspects: (i) robust descriptors of face regions based on local features, partly inspired by the hierarchy in the human visual system, (ii) use of several subspace and exemplar metrics to compare corresponding face regions, (iii) jointly learning which regions are the most discriminative while finding the optimal mixing weights for combining metrics. Experiments on LFW, PIE and MOBIO face datasets show that the proposed algorithm obtains considerably better performance than several recent state-of-the-art techniques, such as Local Principal Angle and the Kernel Affine Hull Method.","PeriodicalId":275325,"journal":{"name":"2012 IEEE Ninth International Conference on Advanced Video and Signal-Based Surveillance","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127641213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Crowd Density Estimation Using Multi-class Adaboost","authors":"Dae-Gyun Kim, Younghyun Lee, Bonhwa Ku, Hanseok Ko","doi":"10.1109/AVSS.2012.31","DOIUrl":"https://doi.org/10.1109/AVSS.2012.31","url":null,"abstract":"In this paper, we propose a crowd density estimation algorithm based on multi-class Adaboost using spectral texture features. Conventional methods based on self-organizing maps have shown unsatisfactory performance in practical scenarios, and in particular, they have exhibited abrupt degradation in performance under certain crowd-density conditions. In order to address these problems, we have developed a new training strategy by incorporating multi-class Adaboost with spectral texture features that represent a global texture pattern. According to the representative experimental results, the proposed method shows an average improvement of about 30% in the correct recognition rate, as compared to existing conventional methods.","PeriodicalId":275325,"journal":{"name":"2012 IEEE Ninth International Conference on Advanced Video and Signal-Based Surveillance","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131989627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detecting People Carrying Objects Utilizing Lagrangian Dynamics","authors":"T. Senst, A. Kuhn, H. Theisel, T. Sikora","doi":"10.1109/AVSS.2012.34","DOIUrl":"https://doi.org/10.1109/AVSS.2012.34","url":null,"abstract":"The availability of dense motion information in the computer vision domain allows for the effective application of Lagrangian techniques that have their origin in fluid flow analysis and dynamical systems theory. A well-established technique that has proven useful in image-based crowd analysis is the Finite Time Lyapunov Exponent (FTLE). Based on this, we present a method to detect people carrying objects and describe a methodology for applying established flow field methods to the problem of describing individuals. Further, we reinterpret Lagrangian features in relation to the underlying motion process and show their applicability to the appearance modeling of pedestrians. This definition allows us to increase the performance of state-of-the-art methods and is shown to be robust under varying parameter settings and different optical flow extraction approaches.","PeriodicalId":275325,"journal":{"name":"2012 IEEE Ninth International Conference on Advanced Video and Signal-Based Surveillance","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134560714","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Abnormal Object Detection Using Feedforward Model and Sequential Filters","authors":"Jiman Kim, Bong-Nam Kang, Hai Wang, Daijin Kim","doi":"10.1109/AVSS.2012.5","DOIUrl":"https://doi.org/10.1109/AVSS.2012.5","url":null,"abstract":"Abnormal object detection and discrimination is a critical research area for vision-based surveillance systems. This paper proposes a novel algorithm for the detection and discrimination of abnormal objects, such as abandoned and stolen objects. The proposed algorithm consists of three stages and three different filters. The three stages cooperate with each other using the feedforward model to enhance detection and discrimination performance, while the sequential filters efficiently reject falsely detected regions using three categories of information. The results of experiments conducted using public datasets indicate that the proposed algorithm is more accurate and has a lower false alarm ratio than existing systems.","PeriodicalId":275325,"journal":{"name":"2012 IEEE Ninth International Conference on Advanced Video and Signal-Based Surveillance","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133890190","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tracking Blurred Object with Data-Driven Tracker","authors":"Jianwei Ding, Kaiqi Huang, T. Tan","doi":"10.1109/AVSS.2012.78","DOIUrl":"https://doi.org/10.1109/AVSS.2012.78","url":null,"abstract":"Motion blur is very common in low-quality image sequences and videos captured by low-speed cameras. Object tracking that does not account for motion blur easily fails on these kinds of videos. We propose a new data-driven tracker in the particle filter framework to address this problem without deblurring the image sequences. The motion blur is detected by exploring the properties of the blurred input image through Fourier analysis. The appearance model is integrated with a set of motion blur kernels that can reflect different blur effects in real scenes. The motion model is improved to be more robust to sudden motion of the target object. To evaluate the proposed algorithm, several challenging videos with significant motion blur are used in the experiments. The experimental results demonstrate the robustness and accuracy of our algorithm.","PeriodicalId":275325,"journal":{"name":"2012 IEEE Ninth International Conference on Advanced Video and Signal-Based Surveillance","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130761723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Baseline Results for Violence Detection in Still Images","authors":"Dong Wang, Z. Zhang, Wei Wang, Liang Wang, T. Tan","doi":"10.1109/AVSS.2012.16","DOIUrl":"https://doi.org/10.1109/AVSS.2012.16","url":null,"abstract":"Recognizing objectionable content draws more and more attention nowadays given the rapid proliferation of images and videos on the Internet. Although there are some investigations into violence video detection and pornographic information filtering, very few existing methods touch on the problem of violence detection in still images. However, given its potential use in violent webpage filtering, online public opinion monitoring and other applications, recognizing violence in still images is worth investigating in depth. To this end, we first establish a new database containing 500 violence images and 1500 non-violence images. We then use the Bag-of-Words (BoW) model, which is frequently adopted in the image classification domain, to discriminate violence images from non-violence images. The effectiveness of four different feature representations is tested within the BoW framework. Finally, the baseline results for violence image detection on our newly built database are reported.","PeriodicalId":275325,"journal":{"name":"2012 IEEE Ninth International Conference on Advanced Video and Signal-Based Surveillance","volume":"79 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132998026","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Online Multiple Instance Joint Model for Visual Tracking","authors":"Longyin Wen, Zhaowei Cai, Menglong Yang, Zhen Lei, Dong Yi, S. Li","doi":"10.1109/AVSS.2012.52","DOIUrl":"https://doi.org/10.1109/AVSS.2012.52","url":null,"abstract":"Although numerous online learning strategies have been proposed to handle appearance variation in visual tracking, existing methods perform well only in certain cases since they lack an effective appearance learning mechanism. In this paper, a joint model tracker (JMT) is presented, which consists of a generative model based on Multiple Subspaces and a discriminative model based on improved Multiple Instance Boosting (MIBoosting). The generative model utilizes a series of locally constructed subspaces to update the Multiple Subspaces model and considers the energy dissipation of dimension reduction in the updating step. The discriminative model adopts the Gaussian Mixture Model (GMM) to estimate the posterior probability of the likelihood function. These two parts supervise each other's updates in a multiple-instance way, which helps our tracker recover from drift. Extensive experiments on various databases validate the effectiveness of our proposed method over other state-of-the-art trackers.","PeriodicalId":275325,"journal":{"name":"2012 IEEE Ninth International Conference on Advanced Video and Signal-Based Surveillance","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117299803","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Combining Neural Networks and Fuzzy Systems for Human Behavior Understanding","authors":"G. Acampora, P. Foggia, Alessia Saggese, M. Vento","doi":"10.1109/AVSS.2012.25","DOIUrl":"https://doi.org/10.1109/AVSS.2012.25","url":null,"abstract":"The psychological overload caused by human inadequacy in maintaining a constant level of attention while simultaneously monitoring multiple visual information sources makes it necessary to develop enhanced video surveillance systems that automatically understand human behaviors and identify dangerous situations. This paper introduces a semantic human behavioral analysis (HBA) system based on a neuro-fuzzy approach that, independently of the specific application, translates tracking kinematic data into a collection of semantic labels characterizing the behavior of different actors in a scene in order to appropriately classify the current situation. Different from other HBA approaches, the proposed system shows a high level of scalability, robustness and tolerance for tracking imprecision and, for this reason, could represent a valid choice for improving the performance of current systems.","PeriodicalId":275325,"journal":{"name":"2012 IEEE Ninth International Conference on Advanced Video and Signal-Based Surveillance","volume":"504 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120864589","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}