{"title":"A structured probabilistic model for recognition","authors":"C. Schmid","doi":"10.1109/CVPR.1999.784725","DOIUrl":"https://doi.org/10.1109/CVPR.1999.784725","url":null,"abstract":"In this paper we derive a probabilistic model for recognition based on local descriptors and spatial relations between these descriptors. Our model takes into account the variability of local descriptors, their saliency as well as the probability of spatial configurations. It is structured to clearly separate the probability of point-wise correspondences from the spatial coherence of sets of correspondences. For each descriptor of the query image, several correspondences in the image database exist. Each of these point-wise correspondences is weighted by its variability and its saliency. We then search for sets of correspondences which reinforce each other, that is which are spatially coherent. The recognized model is the one which obtains the highest evidence from these sets. To validate our probabilistic model, it is compared to an existing method for image retrieval. The experimental results are given for a database containing more than 1000 images. They clearly show the significant gain obtained by adding the probabilistic model.","PeriodicalId":20644,"journal":{"name":"Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1999-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86066183","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pose clustering with density estimation and structural constraints","authors":"S. Moss, E. Hancock","doi":"10.1109/CVPR.1999.784613","DOIUrl":"https://doi.org/10.1109/CVPR.1999.784613","url":null,"abstract":"This paper describes a statistical framework for object alignment by pose clustering. The idea underlying pose clustering is to transform the alignment process from the image domain to that of the appropriate transformation parameters. It commences by taking k-tuples from the primitive-sets for the model and the data. The size of the k-tuples is such that there are sufficient measurements available to estimate the full set of transformation parameters. By pairing each k-tuple in the model and each k-tuple in the data, a set of transformation parameter estimates or alignment votes is accumulated. The work reported here draws on three ideas. Firstly, we estimate maximum likelihood alignment parameters by using the EM algorithm to fit a mixture model to the set of transformation parameter votes. Secondly, we control the order of the underlying structure model using a minimum description length criterion. Finally, we limit problems of combinatorial background by imposing structural constraints on the k-tuples.","PeriodicalId":20644,"journal":{"name":"Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1999-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82740236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Eigenshapes for 3D object recognition in range data","authors":"Richard J. Campbell, P. Flynn","doi":"10.1109/CVPR.1999.784728","DOIUrl":"https://doi.org/10.1109/CVPR.1999.784728","url":null,"abstract":"Much of the recent research in object recognition has adopted an appearance-based scheme, wherein objects to be recognized are represented as a collection of prototypes in a multidimensional space spanned by a number of characteristic vectors (eigen-images) obtained from training views. In this paper, we extend the appearance-based recognition scheme to handle range (shape) data. The result of training is a set of 'eigensurfaces' that capture the gross shape of the objects. These techniques are used to form a system that recognizes objects under an arbitrary rotational pose transformation. The system has been tested on a 20 object database including free-form objects and a 54 object database of manufactured parts. Experiments with the system point out advantages and also highlight challenges that must be studied in future research.","PeriodicalId":20644,"journal":{"name":"Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1999-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91491224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extracting textured vertical facades from controlled close-range imagery","authors":"S. Coorg, S. Teller","doi":"10.1109/CVPR.1999.787004","DOIUrl":"https://doi.org/10.1109/CVPR.1999.787004","url":null,"abstract":"We are developing a system to extract geodetic, textured CAD models from thousands of initially uncontrolled, close-range ground and aerial images of urban scenes. Here we describe one component of the system, which operates after the imagery has been controlled or geo-referenced. This fully automatic component detects significant vertical facades in the scene, then extrudes them to meet an inferred, triangulated terrain and procedurally generated roof polygons. The algorithm then estimates for each surface a computer graphics texture, or diffuse reflectance map, from the many available observations of that surface. We present the results of the algorithm on a complex dataset: nearly 4,000 high-resolution digital images of a small (200 meter square) office park, acquired from close range under highly varying lighting conditions, amidst significant occlusion due both to multiple inter-occluding structures and to dense foliage. While the results are of lower fidelity than would be achievable by an interactive system, our algorithm is the first to be demonstrated on such a large, real-world dataset.","PeriodicalId":20644,"journal":{"name":"Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1999-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88841293","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Norm²-based face recognition","authors":"D. B. Graham, N. Allinson","doi":"10.1109/CVPR.1999.786998","DOIUrl":"https://doi.org/10.1109/CVPR.1999.786998","url":null,"abstract":"Increasingly, the problem of recognising faces under a variety of viewing conditions, including depth rotations, is being considered in the field. The concept of norm-based coding in face recognition is not new but has been little investigated in machine models. Here we describe a norm-based face recognition system which is capable of generalising from a single training view to recognise novel views of target faces. The system is based upon the characteristic nature of faces as they move through a pose-varying eigenspace of facial images and deviations from the norm of a gallery of face images. We illustrate the use of the technique for a large range of pose variation.","PeriodicalId":20644,"journal":{"name":"Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1999-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89144175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Q-warping: direct computation of quadratic reference surfaces","authors":"A. Shashua, Y. Wexler","doi":"10.1109/CVPR.1999.786960","DOIUrl":"https://doi.org/10.1109/CVPR.1999.786960","url":null,"abstract":"We consider the problem of wrapping around an object, of which two views are available, a reference surface and recovering the resulting parametric flow using direct computations (via spatio-temporal derivatives). The well known examples are affine flow models and 8-parameter flow models - both describing a flow field of a planar reference surface. We extend those classic flow models to deal with a quadric reference surface and work out the explicit parametric form of the flow field. As a result we derive a simple warping algorithm that maps between two views and leaves a residual flow proportional to the 3D deviation of the surface from a virtual quadric surface. The applications include image morphing, model building, image stabilization, and disparate view correspondence.","PeriodicalId":20644,"journal":{"name":"Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1999-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91436785","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A simple technique for self-calibration","authors":"Paulo R. S. Mendonça, R. Cipolla","doi":"10.1109/CVPR.1999.786984","DOIUrl":"https://doi.org/10.1109/CVPR.1999.786984","url":null,"abstract":"This paper introduces an extension of Hartley's self-calibration technique based on properties of the essential matrix, allowing for the stable computation of varying focal lengths and principal point. It is well known that the three singular values of an essential matrix must satisfy two conditions: one of them must be zero and the other two must be identical. An essential matrix is obtained from the fundamental matrix by a transformation involving the intrinsic parameters of the pair of cameras associated with the two views. Thus, constraints on the essential matrix can be translated into constraints on the intrinsic parameters of the pair of cameras. This allows for a search in the space of intrinsic parameters of the cameras in order to minimize a cost function related to the constraints. This approach is shown to be simpler than other methods, with comparable accuracy in the results. Another advantage of the technique is that it does not require as input a consistent set of weakly calibrated camera matrices (as defined by Hartley) for the whole image sequence, i.e. a set of cameras consistent with the correspondences and known up to a projective transformation.","PeriodicalId":20644,"journal":{"name":"Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1999-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80726246","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Visual recognition of multi-agent action using binary temporal relations","authors":"S. Intille, A. Bobick","doi":"10.1109/CVPR.1999.786917","DOIUrl":"https://doi.org/10.1109/CVPR.1999.786917","url":null,"abstract":"A probabilistic framework for representing and visually recognizing complex multi-agent action is presented. Motivated by work in model-based object recognition and designed for the recognition of action from visual evidence, the representation has three components: (1) temporal structure descriptions representing the temporal relationships between agent goals, (2) belief networks for probabilistically representing and recognizing individual agent goals from visual evidence, and (3) belief networks automatically generated from the temporal structure descriptions that support the recognition of the complex action. We describe our current work on recognizing American football plays from noisy trajectory data.","PeriodicalId":20644,"journal":{"name":"Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1999-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81165426","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Invariant recognition in hyperspectral images","authors":"G. Healey, D. Slater","doi":"10.1109/CVPR.1999.786975","DOIUrl":"https://doi.org/10.1109/CVPR.1999.786975","url":null,"abstract":"The spectral radiance measured for a material by an airborne hyperspectral sensor depends strongly on the illumination environment and the atmospheric conditions. This dependence has limited the success of material identification algorithms that rely exclusively on the information contained in hyperspectral image data. In this paper we use a comprehensive physical model to show that the set of observed 0.4-2.5 μm spectral radiance vectors for a material lies in a low-dimensional subspace of the hyperspectral measurement space. The physical model captures the dependence of reflected sunlight, reflected skylight, and path radiance terms on the scene geometry and on the distribution of atmospheric gases and aerosols over a wide range of conditions. Using the subspace model, we develop a local maximum likelihood algorithm for automated material identification that is invariant to illumination, atmospheric conditions, and the scene geometry. We demonstrate the invariant algorithm for the automated identification of material samples in HYDICE imagery acquired under different illumination and atmospheric conditions.","PeriodicalId":20644,"journal":{"name":"Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1999-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84758030","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Stereo panorama with a single camera","authors":"Shmuel Peleg, M. Ben-Ezra","doi":"10.1109/CVPR.1999.786969","DOIUrl":"https://doi.org/10.1109/CVPR.1999.786969","url":null,"abstract":"Full panoramic images, covering 360 degrees, can be created either by using panoramic cameras or by mosaicing together many regular images. Creating panoramic views in stereo, where one panorama is generated for the left eye, and another panorama is generated for the right eye is more problematic. Earlier attempts to mosaic images from a rotating pair of stereo cameras faced severe problems of parallax and of scale changes. A new family of multiple viewpoint image projections, the Circular Projections, is developed. Two panoramic images taken using such projections can serve as a panoramic stereo pair. A system is described that generates a stereo panoramic image using circular projections from images or video taken by a single rotating camera. The system works in real-time on a PC. It should be noted that the stereo images are created without computation of 3D structure, and the depth effects are created only in the viewer's brain.","PeriodicalId":20644,"journal":{"name":"Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1999-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84779790","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}