Title: Tracking multiple pedestrians in real-time using kinematics
Authors: S. Apewokin, B. Valentine, M. R. Bales, L. Wills, D. S. Wills
Venue: 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 23 June 2008. DOI: 10.1109/CVPRW.2008.4563149
Abstract: We present an algorithm for real-time tracking of multiple pedestrians in a dynamic scene. The algorithm targets embedded systems and reduces computational and storage costs by using an inexpensive kinematic tracking model with only fixed-point arithmetic representations. It leverages the observation that pedestrians in a dynamic scene tend to move with uniform speed over a small number of consecutive frames. We use a multimodal background modeling technique to accurately segment the foreground (moving people) from the background, apply connectivity analysis to identify blobs in the foreground, and calculate the center of mass of each blob. Finally, we establish correspondence between the center of mass of each blob in the current frame and the center-of-mass information gathered from the two immediately preceding frames. We evaluate the algorithm on a real outdoor video sequence taken with an inexpensive webcam. Our implementation, running on an eBox-2300 Thin Client VESA PC, successfully tracks each pedestrian from frame to frame in real time and performs well in challenging situations resulting from occlusion and crowded conditions.

Title: Improving RANSAC for fast landmark recognition
Authors: Pablo Márquez-Neila, Jacobo Garcia Miro, J. M. Buenaposada, L. Baumela
Venue: 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 23 June 2008. DOI: 10.1109/CVPRW.2008.4563138
Abstract: We introduce a procedure for recognizing and locating planar landmarks for mobile robot navigation, based on the detection and recognition of a set of interest points. We use RANSAC to fit a homography and locate the landmark. Our main contribution is the introduction of a geometrical constraint that reduces the number of RANSAC iterations by discarding minimal subsets. In the experiments conducted, we conclude that this constraint increases RANSAC performance, reducing the number of iterations by about 35% for affine cameras and 75% for projective cameras.

{"title":"A stable optic-flow based method for tracking colonoscopy images","authors":"Jianfei Liu, K. Subramanian, T. Yoo, R. V. Uitert","doi":"10.1109/CVPRW.2008.4562990","DOIUrl":"https://doi.org/10.1109/CVPRW.2008.4562990","url":null,"abstract":"In this paper, we focus on the robustness and stability of our algorithm to plot the position of an endoscopic camera (during a colonoscopy procedure) on the corresponding pre-operative CT scan of the patient. The colon has few topological landmarks, in contrast to bronchoscopy images, where a number of registration algorithms have taken advantage of features such as anatomical marks or bifurcations. Our method estimates the camera motion from the optic-flow computed from the information contained in the video stream. Optic-flow computation is notoriously susceptible to errors in estimating the motion field. Our method relies on the following features to counter this, (1) we use a small but reliable set of feature points (sparse optic-flow field) to determine the spatio-temporal scale at which to perform optic-flow computation in each frame of the sequence, (2) the chosen scales are used to compute a more accurate dense optic flow field, which is used to compute qualitative parameters relating to the main motion direction, and (3) the sparse optic-flow field and the main motion parameters are then combined to estimate the camera parameters. A mathematical analysis of our algorithm is presented to illustrate the stability of our method, as well as comparison to existing motion estimation algorithms. We present preliminary results of using this algorithm on both a virtual colonoscopy image sequence, as well as a colon phantom image sequence.","PeriodicalId":102206,"journal":{"name":"2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130333766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Visual map matching and localization using a global feature map","authors":"O. Pink","doi":"10.1109/CVPRW.2008.4563135","DOIUrl":"https://doi.org/10.1109/CVPRW.2008.4563135","url":null,"abstract":"This paper presents a novel method to support environmental perception of mobile robots by the use of a global feature map. While typical approaches to simultaneous localization and mapping (SLAM) mainly rely on an on-board camera for mapping, our approach uses geographically referenced aerial or satellite images to build a map in advance. The current position on the map is determined by matching features from the on-board camera to the global feature map. The problem of feature matching is posed as a standard point pattern matching problem and a solution using the iterative closest point method is given. The proposed algorithm is designed for use in a street vehicle and uses lane markings as features, but can be adapted to almost any other type of feature that is visible in aerial images. Our approach allows for estimating the robot position at a higher precision than by a purely GPS-based localization, while at the same time providing information about the environment far beyond the current field of view.","PeriodicalId":102206,"journal":{"name":"2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126836194","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Experiments on visual loop closing using vocabulary trees
Authors: Ankita Kumar, J. Tardif, Roy Anati, Kostas Daniilidis
Venue: 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 23 June 2008. DOI: 10.1109/CVPRW.2008.4563140
Abstract: In this paper we study the problem of visual loop closing for long trajectories in an urban environment. We use GPS positioning only to narrow down the search area and use pre-built vocabulary trees to find the best matching image in this search area. Geometric consistency is then used to prune out the bad matches. We compare several vocabulary trees on a sequence of 6.5 kilometers. We experiment with hierarchical k-means based trees as well as extremely randomized trees and compare results obtained using five different trees. We obtain the best results using extremely randomized trees. After enforcing geometric consistency, the matched images look promising for structure-from-motion applications.

Title: Full orientation invariance and improved feature selectivity of 3D SIFT with application to medical image analysis
Authors: S. Allaire, John J. Kim, S. Breen, D. Jaffray, V. Pekar
Venue: 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 23 June 2008. DOI: 10.1109/CVPRW.2008.4563023
Abstract: This paper presents a comprehensive extension of the Scale Invariant Feature Transform (SIFT), originally introduced in 2D, to volumetric images. While tackling the significant computational effort required by such multiscale processing of large data volumes, our implementation addresses two important mathematical issues related to the 2D-to-3D extension. It includes efficient steps to filter out extracted point candidates that have low contrast or are poorly localized along edges or ridges. In addition, it achieves, for the first time, full 3D orientation invariance of the descriptors, which is essential for 3D feature matching. We demonstrate an application of this technique to the feature-based automated registration and segmentation of clinical datasets in the context of radiation therapy.

{"title":"Towards understanding what makes 3D objects appear simple or complex","authors":"S. Sukumar, D. Page, A. Koschan, M. Abidi","doi":"10.1109/CVPRW.2008.4562975","DOIUrl":"https://doi.org/10.1109/CVPRW.2008.4562975","url":null,"abstract":"Humans perceive some objects more complex than others and learning or describing a particular object is directly related to the judged complexity. Towards the goal of understanding why the geometry of some 3D objects appear more complex than others, we conducted a psychophysical study and identified contributing attributes. Our experiments conclude that surface variation, symmetry, part count, simpler part decomposability, intricate details and topology are six significant dimensions that influence 3D visual shape complexity. With that knowledge, we present a method of quantifying complexity and show that the informational aspect of Shannonpsilas theory agrees with the human notion of shape complexity.","PeriodicalId":102206,"journal":{"name":"2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121445787","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CUDA cuts: Fast graph cuts on the GPU","authors":"Vibhav Vineet, P J Narayanan","doi":"10.1109/CVPRW.2008.4563095","DOIUrl":"https://doi.org/10.1109/CVPRW.2008.4563095","url":null,"abstract":"Graph cuts has become a powerful and popular optimization tool for energies defined over an MRF and have found applications in image segmentation, stereo vision, image restoration, etc. The maxflow/mincut algorithm to compute graph-cuts is computationally heavy. The best-reported implementation of graph cuts takes over 100 milliseconds even on images of size 640times480 and cannot be used for real-time applications or when iterated applications are needed. The commodity Graphics Processor Unit (GPU) has emerged as an economical and fast computation co-processor recently. In this paper, we present an implementation of the push-relabel algorithm for graph cuts on the GPU. We can perform over 60 graph cuts per second on 1024times1024 images and over 150 graph cuts per second on 640times480 images on an Nvidia 8800 GTX. The time for each complete graph-cut is about 1 millisecond when only a few weights change from the previous graph, as on dynamic graphs resulting from videos. The CUDA code with a well-defined interface can be downloaded for anyonepsilas use.","PeriodicalId":102206,"journal":{"name":"2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131366559","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic facial expression recognition for intelligent tutoring systems","authors":"J. Whitehill, M. Bartlett, J. Movellan","doi":"10.1109/CVPRW.2008.4563182","DOIUrl":"https://doi.org/10.1109/CVPRW.2008.4563182","url":null,"abstract":"This project explores the idea of facial expression for automated feedback in teaching. We show how automatic realtime facial expression recognition can be effectively used to estimate the difficulty level, as perceived by an individual student, of a delivered lecture. We also show that facial expression is predictive of an individual studentpsilas preferred rate of curriculum presentation at each moment in time. On a video lecture viewing task, training on less than two minutes of recorded facial expression data and testing on a separate validation set, our system predicted the subjectspsila self-reported difficulty scores with mean accuracy of 0:42 (Pearson R) and their preferred viewing speeds with mean accuracy of 0:29. Our techniques are fully automatic and have potential applications for both intelligent tutoring systems (ITS) and standard classroom environments.","PeriodicalId":102206,"journal":{"name":"2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"42 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134024172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Spectral minutiae: A fixed-length representation of a minutiae set
Authors: Hai-yun Xu, R. Veldhuis, T. Kevenaar, A. Akkermans, A. Bazen
Venue: 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 23 June 2008. DOI: 10.1109/CVPRW.2008.4563120
Abstract: Minutiae, which are the endpoints and bifurcations of fingerprint ridges, allow a very discriminative classification of fingerprints. However, a minutiae set is an unordered set, and the minutiae locations suffer from various deformations such as translation, rotation, and scaling. In this paper, we introduce a novel method to represent a minutiae set as a fixed-length feature vector, which is invariant to translation and in which rotation and scaling become translations, so that they can be easily compensated for. By applying the spectral minutiae representation, we can combine the fingerprint recognition system with a template protection scheme, which requires a fixed-length feature vector. This paper also presents two spectral minutiae matching algorithms and shows experimental results.
