{"title":"A Factorized Recursive Estimation of Structure and Motion from Image Velocities","authors":"Adel H. Fakih, J. Zelek","doi":"10.1109/CRV.2007.2","DOIUrl":"https://doi.org/10.1109/CRV.2007.2","url":null,"abstract":"We propose a new approach for the recursive estimation of structure and motion from image velocities. The estimation of structure and motion from image velocities is preferred to the estimation from pixel correspondences when the image displacements are small, since the former approach provides a stronger constraint being based on the instantaneous equation of rigid bodies motion. However the recursive estimation when dealing with image velocities is harder than its counterpart (in the case of pixel correspondences) since the number of points is usually larger and the equations are more involved. For this reason, in contrast to the case of point correspondences, the approaches presented so far are mostly limited to assuming a known 3D motion, or estimating the motion and structure independently. The approach presented in this paper introduces a factorized particle filter for estimating simultaneously the 3D motion and depth. Each particle consists of a 3D motion and a set of probability distributions of the depths of the pixels. The recursive estimation is done in three stages. (1) a resampling and a prediction of new samples; (2) a recursive filtering of the individual depths distributions performed using Extended Kalman Filters; and (3)finally a reweighting of the particles based on the image measurement. Results on simulation data show the efficiency of the approach. Future work will focus on incorporating an estimation of object boundaries to be used in a following regularization step.","PeriodicalId":304254,"journal":{"name":"Fourth Canadian Conference on Computer and Robot Vision (CRV '07)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125052663","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Constructing Face Image Logs that are Both Complete and Concise","authors":"Adam Fourney, R. Laganière","doi":"10.1109/CRV.2007.20","DOIUrl":"https://doi.org/10.1109/CRV.2007.20","url":null,"abstract":"This paper describes a construct that we call a face image log. Face image logs are collections of time stamped images representing faces detected in surveillance videos. The techniques demonstrated in this paper strive to construct face image logs that are complete and concise in the sense that the logs contain only the best images available for each individual observed. We begin by describing how to assess and compare the quality of face images. We then illustrate a robust method for selecting high quality images. This selection process takes into consideration the limitations inherent in existing face detection and person tracking techniques. Experimental results demonstrate that face logs constructed in this manner generally contain fewer than 5% of all detected faces, yet these faces are of high quality, and they represent all individuals detected in the video sequence.","PeriodicalId":304254,"journal":{"name":"Fourth Canadian Conference on Computer and Robot Vision (CRV '07)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125346879","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning Saccadic Gaze Control via Motion Prediciton","authors":"Per-Erik Forssén","doi":"10.1109/CRV.2007.42","DOIUrl":"https://doi.org/10.1109/CRV.2007.42","url":null,"abstract":"This paper describes a system that autonomously learns to perform saccadic gaze control on a stereo pan-tilt unit. Instead of learning a direct map from image positions to a centering action, the system first learns a forward model that predicts how image features move in the visual field as the gaze is shifted. Gaze control can then be performed by searching for the action that best centers a feature in both the left and the right image. By attacking the problem in a different way we are able to collect many training examples in each action, and thus learning converges much faster. The learning is performed using image features obtained from the scale invariant feature transform (SIFT) detected and matched before and after a saccade, and thus requires no special environment during the training stage. We demonstrate that our system stabilises already after 300 saccades, which is more than 100 times fewer than the best current approaches.","PeriodicalId":304254,"journal":{"name":"Fourth Canadian Conference on Computer and Robot Vision (CRV '07)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122525187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Framework for 3D Hand Tracking and Gesture Recognition using Elements of Genetic Programming","authors":"A. El-Sawah, C. Joslin, N. Georganas, E. Petriu","doi":"10.1109/CRV.2007.3","DOIUrl":"https://doi.org/10.1109/CRV.2007.3","url":null,"abstract":"In this paper we present a framework for 3D hand tracking and dynamic gesture recognition using a single camera. Hand tracking is performed in a two step process: we first generate 3D hand posture hypothesis using geometric and kinematics inverse transformations, and then validate the hypothesis by projecting the postures on the image plane and comparing the projected model with the ground truth using a probabilistic observation model. Dynamic gesture recognition is performed using a Dynamic Bayesian Network model. The framework utilizes elements of soft computing to resolve the ambiguity inherent in vision-based tracking by producing a fuzzy hand posture output by the hand tracking module and feeding back potential posture hypothesis from the gesture recognition module.","PeriodicalId":304254,"journal":{"name":"Fourth Canadian Conference on Computer and Robot Vision (CRV '07)","volume":"122 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132318206","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Vehicle Tracking and Distance Estimation Based on Multiple Image Features","authors":"Yixin Chen, M. Das, D. Bajpai","doi":"10.1109/CRV.2007.68","DOIUrl":"https://doi.org/10.1109/CRV.2007.68","url":null,"abstract":"In this paper, we introduce a vehicle tracking algorithm based on multiple image features to detect and track the front car in a collision avoidance system (CAS) application. The algorithm uses multiple image features, such as corner, edge, gradient, vehicle symmetry property, and image matching technique to robustly detect the vehicle bottom corners and edges, and estimate the vehicle width. Based on the estimated vehicle width, a few pre-selected edge templates are used to match the image edges that allow us to estimate the vehicle height, and also the distance between the front vehicle and the host vehicle. Some experimental results based on real world video images are presented. These seem to indicate that the algorithm is capable of identifying a front vehicle, tracking it, and estimating its distance from the host vehicle.","PeriodicalId":304254,"journal":{"name":"Fourth Canadian Conference on Computer and Robot Vision (CRV '07)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134348492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using Feature Selection For Object Segmentation and Tracking","authors":"M. S. Allili, D. Ziou","doi":"10.1109/CRV.2007.67","DOIUrl":"https://doi.org/10.1109/CRV.2007.67","url":null,"abstract":"Most image segmentation algorithms in the past are based on optimizing an objective function that aims to achieve the similarity between several low-level features to build a partition of the image into homogeneous regions. In the present paper, we propose to incorporate the relevance (selection) of the grouping features to enforce the segmentation toward the capturing of objects of interest. The relevance of the features is determined through a set of positive and negative examples of a specific object defined a priori by the user. The calculation of the relevance of the features is performed by maximizing an objective function defined on the mixture likelihoods of the positive and negative object examples sets. The incorporation of the features relevance in the object segmentation is formulated through an energy functional which is minimized by using level set active contours. We show the efficiency of the approach on several examples of object of interest segmentation and tracking where the features relevance was used.","PeriodicalId":304254,"journal":{"name":"Fourth Canadian Conference on Computer and Robot Vision (CRV '07)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114626055","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A new segmentation method for MRI images of the shoulder joint","authors":"N. Nguyen, D. Laurendeau, A. Albu","doi":"10.1109/CRV.2007.4","DOIUrl":"https://doi.org/10.1109/CRV.2007.4","url":null,"abstract":"This paper presents an integrated region-based and gradient-based supervised method for segmentation of a patient magnetic resonance images (MRI) of the shoulder joint. The method is noninvasive, anatomy-based and requires only simple user interaction. It is generic and easily customizable for a variety of routine clinical uses in orthopedic surgery.","PeriodicalId":304254,"journal":{"name":"Fourth Canadian Conference on Computer and Robot Vision (CRV '07)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128214094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Oriented-Filters Based Head Pose Estimation","authors":"M. Dahmane, J. Meunier","doi":"10.1109/CRV.2007.48","DOIUrl":"https://doi.org/10.1109/CRV.2007.48","url":null,"abstract":"The aim of this study is to elaborate and validate a methodology to automatically assess head orientation with respect to a camera in a video sequence. The proposed method uses relatively stable facial features (upper points of the eyebrows, upper nasolabial-furrow corners and nasal root) that have symmetric properties to recover the face slant and tilt angles. These fiducial points are characterized by a bank of steerable filters. Using the frequency domain, we present an elegant formulation to linearly decompose a Gaussian steerable filter into a set of x, y separable basis Gaussian kernels. A practical scheme to estimate the position of the occasionally occluded nasolabial-furrow facial feature is also proposed. Results show that head motion can be estimated with sufficient precision to obtain the gaze direction without camera calibration or any other particular settings are required for this purpose.","PeriodicalId":304254,"journal":{"name":"Fourth Canadian Conference on Computer and Robot Vision (CRV '07)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130402882","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real-Time Commercial Recognition Using Color Moments and Hashing","authors":"Abhishek Shivadas, J. Gauch","doi":"10.1109/CRV.2007.53","DOIUrl":"https://doi.org/10.1109/CRV.2007.53","url":null,"abstract":"In this paper, our focus is on real-time commercial recognition. In particular, our goal is to correctly identify all commercials that are stored in our commercial database within the first second of their broadcast. To meet this objective, we make use of 27 color moments to characterize the content of every video frame. This representation is much more compact than most color histogram representations, and it less sensitive to noise and other distortion. We use frame-level hashing with subsequent matching of moment vectors and video frames to perform commercial recognition. Hashing provides constant time access to millions of video frames, so this approach can perform in real-time for databases containing thousands of commercials. In our experiments with a database of 63 commercials, we achieved 96% recall, 100% precision, and 98% utility while recognizing commercials within the first 1/2 second of their broadcast.","PeriodicalId":304254,"journal":{"name":"Fourth Canadian Conference on Computer and Robot Vision (CRV '07)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127993893","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Registration of IR and EO Video Sequences based on Frame Difference","authors":"Zheng Liu, R. Laganière","doi":"10.1109/CRV.2007.56","DOIUrl":"https://doi.org/10.1109/CRV.2007.56","url":null,"abstract":"Multi-modal imaging sensors are employed in advanced surveillance systems in the recent years. The performance of surveillance systems can be enhanced by using information beyond the visible spectrum, for example, infrared imaging. To ensure correctness of low- or high-level processing, multi-modal imagers must be fully calibrated or registered. In this paper, an algorithm is proposed to register the video sequences acquired by an infrared and an electro-optical (CCD) camera. The registration method is based on the silhouette extracted by differencing adjacent frames. This difference is found by an image structural similarity measurement. Initial registration is implemented by tracing the top head points in consecutive frames. Finally, an optimization procedure to maximize mutual information is employed to refine the registration results.","PeriodicalId":304254,"journal":{"name":"Fourth Canadian Conference on Computer and Robot Vision (CRV '07)","volume":"294 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122788233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}