{"title":"Learning and evaluating visual features for pose estimation","authors":"Robert Sim, G. Dudek","doi":"10.1109/ICCV.1999.790419","DOIUrl":"https://doi.org/10.1109/ICCV.1999.790419","url":null,"abstract":"We present a method for learning a set of visual landmarks which are useful for pose estimation. The landmark learning mechanism is designed to be applicable to a wide range of environments, and generalized for different approaches to computing a pose estimate. Initially, each landmark is detected as a focal extremum of a measure of distinctiveness and represented by a principal components encoding which is exploited for matching. Attributes of the observed landmarks can be parameterized using a generic parameterization method and then evaluated in terms of their utility for pose estimation. We present experimental evidence that demonstrates the utility of the method.","PeriodicalId":358754,"journal":{"name":"Proceedings of the Seventh IEEE International Conference on Computer Vision","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125489938","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A cluster-based statistical model for object detection","authors":"Thomas D. Rikert, Michael J. Jones, Paul A. Viola","doi":"10.1109/ICCV.1999.790386","DOIUrl":"https://doi.org/10.1109/ICCV.1999.790386","url":null,"abstract":"This paper presents an approach to object detection which is based on recent work in statistical models for texture synthesis and recognition. Our method follows the texture recognition work of De Bonet and Viola (1998). We use feature vectors which capture the joint occurrence of local features at multiple resolutions. The distribution of feature vectors for a set of training images of an object class is estimated by clustering the data and then forming a mixture of Gaussian models. The mixture model is further refined by determining which clusters are the most discriminative for the class and retaining only those clusters. After the model is learned, test images are classified by computing the likelihood of their feature vectors with respect to the model. We present promising results in applying our technique to face detection and car detection.","PeriodicalId":358754,"journal":{"name":"Proceedings of the Seventh IEEE International Conference on Computer Vision","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129562647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"3D articulated models and multi-view tracking with silhouettes","authors":"Quentin Delamarre, O. Faugeras","doi":"10.1109/ICCV.1999.790292","DOIUrl":"https://doi.org/10.1109/ICCV.1999.790292","url":null,"abstract":"We propose a method to estimate the motion of a person filmed by two or more fixed cameras. The novelty of our technique is its ability to cope with fast movements, self-occlusions and noisy images. Our algorithms are based on the latest work on calibration and image segmentation developed in our lab. We compare the projections of a 3D model of a person on the images to the detected silhouettes of the person, creating forces that move the 3D model towards the final estimate of the real pose. We developed a fast algorithm that computes the motion of the articulated 3D model. We show that our results are good, even if the cameras are not synchronized.","PeriodicalId":358754,"journal":{"name":"Proceedings of the Seventh IEEE International Conference on Computer Vision","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129813772","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recovery and tracking of continuous 3D surfaces from stereo data using a deformable dual-mesh","authors":"Y. S. Akgul, C. Kambhamettu","doi":"10.1109/ICCV.1999.790299","DOIUrl":"https://doi.org/10.1109/ICCV.1999.790299","url":null,"abstract":"We propose a novel method for continuous 3D depth recovery and tracking using calibrated stereo. The method integrates stereo correspondence, surface reconstruction and tracking by using a new single deformable dual-mesh optimization, resulting in simplicity, robustness and efficiency. In order to combine stereo correspondence and structure recovery, the method introduces an external energy function defined for a 3D volume based on cross-correlation between the stereo pairs. The internal energy functional of the deformable dual mesh imposes smoothness on the surfaces and serves as a communication tool between the two meshes. Under the forces produced by the energy terms, the dual mesh deforms to recover and track the 3D surface. The newly introduced dual-mesh model, which is one of the main contributions of this paper, makes the system robust against local minima while remaining efficient. A coarse-to-fine minimization approach makes the system even more efficient. Tracking is achieved by using the recovered surface as an initial position for the next time frame. Although the system can effectively utilize initial surface positions and disparity data, they are not needed for successful operation, which makes this system applicable to a wide range of areas. We present the results of a number of experiments on stereo human face and cloud images, which demonstrate that our new method is effective.","PeriodicalId":358754,"journal":{"name":"Proceedings of the Seventh IEEE International Conference on Computer Vision","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129662670","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Coupled lighting direction and shape estimation from single images","authors":"D. Samaras, Dimitris N. Metaxas","doi":"10.1109/ICCV.1999.790313","DOIUrl":"https://doi.org/10.1109/ICCV.1999.790313","url":null,"abstract":"This paper presents a new method for the simultaneous estimation of lighting direction and shape from shading. The method estimates the shape and the lighting direction using a two-step iterative process. We assume an initial (possibly incorrect) estimate of the lighting position. A stiff deformable model is then fitted to the image, assuming this lighting position. Next, a least-squares estimate of the lighting position is derived from the model using the Levenberg-Marquardt method. The two steps-model fitting and lighting-position estimation-are iterated. Once the light direction has converged to a stable solution, the deformable model stiffness is lowered and the model fits accurately given the lighting model. In addition, we show how the method can be used with either orthographic or perspective projection assumptions. In a variety of experiments on real and synthetic data, the method is robust to errors in both the initial light position and shape estimates.","PeriodicalId":358754,"journal":{"name":"Proceedings of the Seventh IEEE International Conference on Computer Vision","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123830662","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cluster-based segmentation of natural scenes","authors":"E. Pauwels, Greet Frederix","doi":"10.1109/ICCV.1999.790377","DOIUrl":"https://doi.org/10.1109/ICCV.1999.790377","url":null,"abstract":"In cluster-based segmentation pixels are mapped into various feature spaces whereupon they are subjected to a grouping algorithm. In this paper we develop a robust and versatile non-parametric clustering algorithm that is able to handle the unbalanced and irregular clusters encountered in such segmentation applications. The strength of our approach lies in the definition and use of two cluster validity indices that are independent of the cluster topology. By combining them, an excellent clustering can be identified, and experiments confirm that the associated clusters do indeed correspond to perceptually salient image regions.","PeriodicalId":358754,"journal":{"name":"Proceedings of the Seventh IEEE International Conference on Computer Vision","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123891975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic feature ordering for efficient registration","authors":"Tat-Jen Cham, James M. Rehg","doi":"10.1109/ICCV.1999.790395","DOIUrl":"https://doi.org/10.1109/ICCV.1999.790395","url":null,"abstract":"Existing sequential feature-based registration algorithms involving search typically either select features randomly (e.g. the RANSAC approach (M. Fischler and R. Bolles, 1981)) or assume a predefined, intuitive ordering for the features (e.g. based on size or resolution). The paper presents a formal framework for computing an ordering for features which maximizes search efficiency. Features are ranked according to a matching-ambiguity measure, and an algorithm is proposed which couples the feature selection with the parameter estimation, resulting in a dynamic feature ordering. The analysis is extended to template features, where the matching is non-discrete, and a sample refinement process is proposed. The framework is demonstrated effectively on the localization of a person in an image, using a kinematic model with template features. Different priors are used on the model parameters and the results demonstrate nontrivial variations in the optimal feature hierarchy.","PeriodicalId":358754,"journal":{"name":"Proceedings of the Seventh IEEE International Conference on Computer Vision","volume":"175 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114065359","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Control in a 3D reconstruction system using selective perception","authors":"M. Marengoni, A. Hanson, S. Zilberstein, E. Riseman","doi":"10.1109/ICCV.1999.790421","DOIUrl":"https://doi.org/10.1109/ICCV.1999.790421","url":null,"abstract":"This paper presents a control structure for general purpose image understanding that addresses both the high level of uncertainty in local hypotheses and the computational complexity of image interpretation. The control of vision algorithms is performed by an independent subsystem that uses Bayesian networks and utility theory to compute the marginal value of information provided by alternative operators and selects the ones with the highest value. We have implemented and tested this control structure with several aerial image datasets. The results show that the knowledge base used by the system can be acquired using standard learning techniques and that the value-driven approach to the selection of vision algorithms leads to performance gains. Moreover, the modular system architecture simplifies the addition of both control knowledge and new vision algorithms.","PeriodicalId":358754,"journal":{"name":"Proceedings of the Seventh IEEE International Conference on Computer Vision","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114372619","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real-time motion analysis with linear-programming","authors":"M. Ben-Ezra, Shmuel Peleg, M. Werman","doi":"10.1109/ICCV.1999.790290","DOIUrl":"https://doi.org/10.1109/ICCV.1999.790290","url":null,"abstract":"A method to compute motion models in real time from point-to-line correspondences using linear programming is presented. Point-to-line correspondences are the most reliable motion measurements given the aperture effect, and it is shown how they can approximate other motion measurements as well. Using an L1 error measure for image alignment based on point-to-line correspondences and minimizing this measure using linear programming achieves results which are more robust than the commonly used L2 metric. While estimators based on L1 are not theoretically robust, experiments show that the proposed method is robust enough to allow accurate motion recovery in hundreds of consecutive frames. The entire computation is performed in real time on a PC with no special hardware.","PeriodicalId":358754,"journal":{"name":"Proceedings of the Seventh IEEE International Conference on Computer Vision","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124144686","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inherent two-way ambiguity in 2D projective reconstruction from three uncalibrated 1D images","authors":"Long Quan","doi":"10.1109/ICCV.1999.791240","DOIUrl":"https://doi.org/10.1109/ICCV.1999.791240","url":null,"abstract":"It is shown that there always exists a two-way ambiguity for 2D projective reconstruction from three uncalibrated 1D views independent of the number of point correspondences. It is also shown that the two distinct projective reconstructions are exactly related by a quadratic transformation with the three camera centers as the fundamental points. The unique reconstruction exists only for the case where the three camera centers are aligned. The theoretical results are demonstrated on numerical examples.","PeriodicalId":358754,"journal":{"name":"Proceedings of the Seventh IEEE International Conference on Computer Vision","volume":"205 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132786833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}