{"title":"Video Representation with Dynamic Features from Multi-Frame Frame-Difference Images","authors":"M. Lee, Alexander Lee, D. Lee, Soo-Young Lee","doi":"10.1109/WMVC.2007.38","DOIUrl":"https://doi.org/10.1109/WMVC.2007.38","url":null,"abstract":"The extraction of dynamic motion features from multiple video frames is reported using three unsupervised learning algorithms, i.e., Principal Component Analysis (PCA), Independent Component Analysis (ICA), and Non-negative Matrix Factorization (NMF). Since the human perception of facial motion goes through two different pathways, i.e., the lateral fusiform gyrus for the invariant aspects and the superior temporal sulcus for the changeable aspects of faces, we extracted the dynamic video features from multiple consecutive frames for the latter. Both the original videos and the frame-difference sequences are used for comparison. The required number of multiframe features for the same representation accuracy is almost independent of the frame length. Therefore, the multiple-frame features are much more efficient for video representation than the single-frame static features. The extracted features are also used for lipreading, and the features from frame-difference sequences demonstrated better recognition rates than those from original videos.","PeriodicalId":177842,"journal":{"name":"2007 IEEE Workshop on Motion and Video Computing (WMVC'07)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128310124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis of Irregularities in Human Actions with Volumetric Motion History Images","authors":"A. Albu, T. Beugeling, N. Virji-Babul, C. Beach","doi":"10.1109/WMVC.2007.8","DOIUrl":"https://doi.org/10.1109/WMVC.2007.8","url":null,"abstract":"This paper describes a new 3D motion representation, the Volumetric Motion History Image (VMHI), to be used for the analysis of irregularities in human actions. Such irregularities may occur either in speed or orientation and are strong indicators of the balance abilities and of the confidence level of the subject performing the activity. The proposed VMHI representation overcomes limits of the standard MHI related to motion self-occlusion and speed and is therefore suitable for the visualization and quantification of abnormal motion. This work focuses on the analysis of sway, which is the most common motion irregularity in the studied set of human actions. The sway is visualized and quantified via a user interface using a measure of spatiotemporal surface smoothness, namely the deviation vector. Experimental results show that the deviation vector is a reliable measure for quantifying the deviation of abnormal motion from its corresponding normal motion.","PeriodicalId":177842,"journal":{"name":"2007 IEEE Workshop on Motion and Video Computing (WMVC'07)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130936611","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Activity Recognition using Dynamic Bayesian Networks with Automatic State Selection","authors":"J. Muncaster, Yunqian Ma","doi":"10.1109/WMVC.2007.5","DOIUrl":"https://doi.org/10.1109/WMVC.2007.5","url":null,"abstract":"Applying advanced video technology to understand activity and intent is becoming increasingly important for intelligent video surveillance. We present a general model of a d-level dynamic Bayesian network to perform complex event recognition. The levels of the network are constrained to enforce state hierarchy while the dth level models the duration of the simplest event. Moreover, in this paper we propose to use the deterministic annealing clustering method to automatically discover the states for the observable levels. We used real-world data sets to show the effectiveness of our proposed method.","PeriodicalId":177842,"journal":{"name":"2007 IEEE Workshop on Motion and Video Computing (WMVC'07)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132235974","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Practical Camera Auto Calibration using Semidefinite Programming","authors":"M. Agrawal","doi":"10.1109/WMVC.2007.39","DOIUrl":"https://doi.org/10.1109/WMVC.2007.39","url":null,"abstract":"We describe a novel approach to the camera auto calibration problem. The uncalibrated camera is first moved in a static scene and feature points are matched across frames to obtain the feature tracks. Mismatches in these tracks are identified by computing the fundamental matrices between adjacent frames. The inlier feature tracks are then used to obtain a projective structure and motion of the camera using an iterative perspective factorization scheme. The novelty of our approach lies in the application of semidefinite programming for recovering the camera focal lengths and the principal point. Semidefinite programming was used in our earlier work [1] to recover focal lengths under the assumption of known principal points. In this paper, we relax the constraint of a known principal point and do an exhaustive search for the principal points. Moreover, we describe an end-to-end system for auto calibration and present experimental results to evaluate our approach.","PeriodicalId":177842,"journal":{"name":"2007 IEEE Workshop on Motion and Video Computing (WMVC'07)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132660706","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Alternative Formulation for Five Point Relative Pose Problem","authors":"Dhruv Batra, Bart C. Nabbe, M. Hebert","doi":"10.1109/WMVC.2007.6","DOIUrl":"https://doi.org/10.1109/WMVC.2007.6","url":null,"abstract":"The \"Five Point Relative Pose Problem\" is to find all possible camera configurations between two calibrated views of a scene given five point-correspondences. We take a fresh look at this well-studied problem with an emphasis on the parametrization of Essential Matrices used by various methods over the years. Using one of these parametrizations, a novel algorithm is proposed, in which the solution to the problem is encoded in a system of nine quadratic equations in six variables, and is reached by formulating this as a constrained optimization problem. We compare our algorithm with an existing 5-point method, and show our formulation to be more robust in the presence of noise.","PeriodicalId":177842,"journal":{"name":"2007 IEEE Workshop on Motion and Video Computing (WMVC'07)","volume":"34 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114020605","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detection and Tracking of Moving Vehicles in Crowded Scenes","authors":"Xuefeng Song, R. Nevatia","doi":"10.1109/WMVC.2007.13","DOIUrl":"https://doi.org/10.1109/WMVC.2007.13","url":null,"abstract":"Vehicle inter-occlusion is a significant problem for multiple-vehicle tracking even with a static camera. The difficulty is that the one-to-one correspondence between foreground blobs and vehicles does not hold when multiple vehicle blobs are merged in the scene. Making use of camera and vehicle model constraints, we propose an MCMC-based method to segment multiple merged vehicles into individual vehicles with their respective orientation. Then a Viterbi algorithm is applied to search through the sequence for the optimal tracks. Our method automatically detects and tracks multiple vehicles with orientation changes and prevalent occlusion, without requiring a special region to initialize each vehicle individually. Tests are performed on video sequences from busy street intersections and show very promising results.","PeriodicalId":177842,"journal":{"name":"2007 IEEE Workshop on Motion and Video Computing (WMVC'07)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116716331","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tracking vehicle targets with large aspect change","authors":"Behzad Jamasbi, S. Motamedi, A. Behrad","doi":"10.1109/WMVC.2007.37","DOIUrl":"https://doi.org/10.1109/WMVC.2007.37","url":null,"abstract":"In this paper we present a novel method for tracking rigid objects (mostly vehicles) in video sequences with cluttered background, obtained from a mobile camera. We estimate the motion model of the target using corner extraction and matching together with the LMedS statistical approach. The resultant model assists an active contour (snake) to track the target efficiently. To avoid tracking errors resulting from large aspect change of the target, we propose a snake with an automatic local swelling mechanism. This mechanism enables the snake to include new parts of the target which appear as a result of large aspect change. Several experiments have been conducted to show the promise of our algorithms. Key words: contour tracking, moving target, aspect change, active contour, feature matching","PeriodicalId":177842,"journal":{"name":"2007 IEEE Workshop on Motion and Video Computing (WMVC'07)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127943934","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Map-Enhanced Detection and Tracking from a Moving Platform with Local and Global Data Association","authors":"Qian Yu, G. Medioni","doi":"10.1109/WMVC.2007.23","DOIUrl":"https://doi.org/10.1109/WMVC.2007.23","url":null,"abstract":"We present an approach to detect and track moving objects from a moving platform. Moreover, given a global map, such as a satellite image, our approach can locate and track the targets in geo-coordinates, namely longitude and latitude. The map information is used as a global constraint for compensating the camera motion, which is critical for motion detection on a moving platform. In addition, by projecting the targets' positions to a global map, tracking is performed in coordinates with physical meaning and thus the motion model is more meaningful than tracking in image coordinates. In a real scenario, targets can leave the field of view or be occluded. Thus we address tracking as a data association problem at the local and global levels. At the local level, the moving image blobs, provided from the motion detection, are associated into tracklets by an MCMC (Markov Chain Monte Carlo) Data Association algorithm. Both motion and appearance likelihood are considered when local data association is performed. Then, at the global level, tracklets are linked by their appearance and spatio-temporal consistency on the global map. Experiments show that our method can deal with long-term occlusion and segmented tracks even when targets leave the field of view.","PeriodicalId":177842,"journal":{"name":"2007 IEEE Workshop on Motion and Video Computing (WMVC'07)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133274338","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Object-Based Spatial Segmentation of Video Guided by Depth and Motion Information","authors":"Jaime S. Cardoso, Jorge S. Cardoso, L. Côrte-Real","doi":"10.1109/WMVC.2007.31","DOIUrl":"https://doi.org/10.1109/WMVC.2007.31","url":null,"abstract":"Automatic spatial video segmentation is a problem without a general solution at the current state-of-the-art. Most of the difficulties arise from the process of capturing images, which remain a very limited sample of the scene they represent. The capture of additional information, in the form of depth data, is a step forward to address this problem. We start by investigating the use of depth data for better image segmentation; a novel segmentation framework is proposed, with depth being mainly used to guide a segmentation algorithm on the colour information. Then, we extend the method to also incorporate motion information in the segmentation process. The effectiveness and simplicity of the proposed method are documented with results on a selected set of image sequences. The achieved quality raises the expectation for a significant improvement on operations relying on spatial video segmentation as a pre-process.","PeriodicalId":177842,"journal":{"name":"2007 IEEE Workshop on Motion and Video Computing (WMVC'07)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115435667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Coupled Hidden Semi Markov Models for Activity Recognition","authors":"P. Natarajan, R. Nevatia","doi":"10.1109/WMVC.2007.12","DOIUrl":"https://doi.org/10.1109/WMVC.2007.12","url":null,"abstract":"Recognizing human activity from a stream of sensory observations is important for a number of applications such as surveillance and human-computer interaction. Hidden Markov Models (HMMs) have been proposed as suitable tools for modeling the variations in the observations for the same action and for discriminating among different actions. HMMs have come into wide use for this task but the standard form suffers from several limitations. These include unrealistic models for the duration of a sub-event and not encoding interactions among multiple agents directly. Semi-Markov models and coupled HMMs have been proposed in previous work to handle these issues. We combine these two concepts into a coupled Hidden semi-Markov Model (CHSMM). CHSMMs pose huge computational complexity challenges. We present efficient algorithms for learning and decoding in such structures and demonstrate their utility by experiments with synthetic and real data.","PeriodicalId":177842,"journal":{"name":"2007 IEEE Workshop on Motion and Video Computing (WMVC'07)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125385596","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}