{"title":"Fast approximate energy minimization via graph cuts","authors":"Yuri Boykov, O. Veksler, R. Zabih","doi":"10.1109/ICCV.1999.791245","DOIUrl":"https://doi.org/10.1109/ICCV.1999.791245","url":null,"abstract":"In this paper we address the problem of minimizing a large class of energy functions that occur in early vision. The major restriction is that the energy function's smoothness term must only involve pairs of pixels. We propose two algorithms that use graph cuts to compute a local minimum even when very large moves are allowed. The first move we consider is an /spl alpha/-/spl beta/-swap: for a pair of labels /spl alpha/,/spl beta/, this move exchanges the labels between an arbitrary set of pixels labeled /spl alpha/ and another arbitrary set labeled /spl beta/. Our first algorithm generates a labeling such that there is no swap move that decreases the energy. The second move we consider is an /spl alpha/-expansion: for a label /spl alpha/, this move assigns an arbitrary set of pixels the label /spl alpha/. Our second algorithm, which requires the smoothness term to be a metric, generates a labeling such that there is no expansion move that decreases the energy. Moreover, this solution is within a known factor of the global minimum. We experimentally demonstrate the effectiveness of our approach on image restoration, stereo and motion.","PeriodicalId":358754,"journal":{"name":"Proceedings of the Seventh IEEE International Conference on Computer Vision","volume":"43 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120887809","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Backpack: detection of people carrying objects using silhouettes","authors":"I. Haritaoglu, Ross Cutler, David Harwood, L. Davis","doi":"10.1109/ICCV.1999.791204","DOIUrl":"https://doi.org/10.1109/ICCV.1999.791204","url":null,"abstract":"We describe a video-rate surveillance algorithm to detect and track people from a stationary camera, and to determine if they are carrying objects or moving unencumbered. The contribution of the paper is the shape analysis algorithm that both determines if a person is carrying an object and segments the object from the person so that it can be tracked, e.g., during an exchange of objects between two people. As the object is segmented, an appearance model of the object is constructed. The method combines periodic motion estimation with static symmetry analysis of the silhouettes of a person in each frame of the sequence. Experimental results demonstrate robustness and real-time performance of the proposed algorithm.","PeriodicalId":358754,"journal":{"name":"Proceedings of the Seventh IEEE International Conference on Computer Vision","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132327288","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Specularities on surfaces with tangential hairs or grooves","authors":"R. Lu, J. Koenderink, A. Kappers","doi":"10.1109/ICCV.1999.791189","DOIUrl":"https://doi.org/10.1109/ICCV.1999.791189","url":null,"abstract":"Specularities on surfaces with tangential hairs or grooves are readily observable in nature. Examples of such phenomena are the arched or looped highlights observed on horses and on human heads and the linear or curved specularities observed on parts of industrial machinery that have tangential grooves. We investigate the geometry of curvilinear specularities on surfaces of different curvature with tangential hairs or grooves of different orientation, under controlled lighting and viewing conditions. First the nature of these specularities is investigated qualitatively. Then specularities on parametric surfaces and hair or groove orientations are calculated for some specific cases. Explicit calculations of specularities on some special surfaces, cylinders, cones, and spheres, are verified by photographs of the reflections. Aspects of the work are applicable to computer graphics and can be utilized for the image interpretation of surface specularities.","PeriodicalId":358754,"journal":{"name":"Proceedings of the Seventh IEEE International Conference on Computer Vision","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130329710","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Classification of human body motion","authors":"J. Rittscher, A. Blake","doi":"10.1109/ICCV.1999.791284","DOIUrl":"https://doi.org/10.1109/ICCV.1999.791284","url":null,"abstract":"The classification of human body motion is a difficult problem. In particular, the automatic segmentation of image sequences containing more than one class of motion is challenging. An effective approach is to use mixed discrete/continuous states to couple perception with classification. A spline contour is used to track the outline of the person. We show that, for a quasi-periodic human body motion, an autoregressive process is a suitable model for the contour dynamics. This can then be used as a dynamical model for mixed-state \"condensation\" filtering, switching automatically between different motion classes. We have developed \"partial importance sampling\" to enhance the efficiency of the mixed-state condensation filter. It is also shown that the importance sampling can be done in linear time, instead of the previous quadratic algorithm. \"Tying\" of discrete states is used to obtain further efficiency improvements. Automatic segmentation is demonstrated on video sequences of aerobic exercises. The performance is promising, but there remains a residual misclassification rate, and possible explanations for this are discussed.","PeriodicalId":358754,"journal":{"name":"Proceedings of the Seventh IEEE International Conference on Computer Vision","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129128155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Approximate tree matching and shape similarity","authors":"Tyng-Luh Liu, D. Geiger","doi":"10.1109/ICCV.1999.791256","DOIUrl":"https://doi.org/10.1109/ICCV.1999.791256","url":null,"abstract":"We present a framework for 2D shape contour (silhouette) comparison that can account for stretchings, occlusions and region information. Topological changes due to the original 3D scenarios and articulations are also addressed. To compare the degree of similarity between any two shapes, our approach is to represent each shape contour with a free tree structure derived from a shape axis (SA) model, which we have recently proposed. We then use a tree matching scheme to find the best approximate match and the matching cost. To deal with articulations, stretchings and occlusions, three local tree matching operations, merge, cut, and merge-and-cut, are introduced to yield optimally approximate matches, which can accommodate not only one-to-one but many-to-many mappings. The optimization process efficiently yields a guaranteed globally optimal match. Experimental results on a variety of shape contours are provided.","PeriodicalId":358754,"journal":{"name":"Proceedings of the Seventh IEEE International Conference on Computer Vision","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124750277","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tracking self-occluding articulated objects in dense disparity maps","authors":"N. Jojic, M. Turk, Thomas S. Huang","doi":"10.1109/ICCV.1999.791207","DOIUrl":"https://doi.org/10.1109/ICCV.1999.791207","url":null,"abstract":"In this paper, we present an algorithm for real-time tracking of articulated structures in dense disparity maps derived from stereo image sequences. A statistical image formation model that accounts for occlusions plays the central role in our tracking approach. This graphical model (a Bayesian network) assumes that the range image of each part of the structure is formed by drawing the depth candidates from a 3-D Gaussian distribution. The advantage over the classical mixture of Gaussians is that our model takes into account occlusions by picking the minimum depth (which could be regarded as a probabilistic version of z-buffering). The model also enforces articulation constraints among the parts of the structure. The tracking problem is formulated as an inference problem in the image formation model. This model can be extended and used for other tasks in addition to the one described in the paper and can also be used for estimating probability distribution functions instead of the ML estimates of the tracked parameters. For the purposes of real-time tracking, we used certain approximations in the inference process, which resulted in a real-time two-stage inference algorithm. We were able to successfully track upper human body motion in real time and in the presence of self-occlusions.","PeriodicalId":358754,"journal":{"name":"Proceedings of the Seventh IEEE International Conference on Computer Vision","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126627373","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Appearance compression and synthesis based on 3D model for mixed reality","authors":"K. Nishino, Yoichi Sato, K. Ikeuchi","doi":"10.1109/ICCV.1999.791195","DOIUrl":"https://doi.org/10.1109/ICCV.1999.791195","url":null,"abstract":"Rendering photorealistic virtual objects from their real images is one of the main research issues in mixed reality systems. We previously proposed the Eigen-Texture method (K. Nishino et al., 1999), a new rendering method for generating virtual images of objects from their real images to deal with the problems posed by past work in image based methods and model based methods. Eigen-Texture method samples appearances of a real object under various illumination and viewing conditions, and compresses them in the 2D coordinate system defined on the 3D model surface. However, we had a serious limitation in our system, due to the alignment problem of the 3D model and color images. We deal with this limitation by solving the alignment problem; we do this by using the method originally designed by P. Viola (1995). The paper describes the method and reports on how we implement it.","PeriodicalId":358754,"journal":{"name":"Proceedings of the Seventh IEEE International Conference on Computer Vision","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131170293","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Object recognition from local scale-invariant features","authors":"D. Lowe","doi":"10.1109/ICCV.1999.790410","DOIUrl":"https://doi.org/10.1109/ICCV.1999.790410","url":null,"abstract":"An object recognition system has been developed that uses a new class of local image features. The features are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine or 3D projection. These features share similar properties with neurons in inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently detected through a staged filtering approach that identifies stable points in scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales. The keys are used as input to a nearest neighbor indexing method that identifies candidate object matches. Final verification of each match is achieved by finding a low residual least squares solution for the unknown model parameters. Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.","PeriodicalId":358754,"journal":{"name":"Proceedings of the Seventh IEEE International Conference on Computer Vision","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116755781","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Differential matching constraints","authors":"B. Triggs","doi":"10.1109/ICCV.1999.791244","DOIUrl":"https://doi.org/10.1109/ICCV.1999.791244","url":null,"abstract":"We introduce a finite difference expansion for closely spaced cameras in projective vision, and use it to derive differential analogues of the finite-displacement projective matching tensors and constraints. The results are simpler, more general and easier to use than Astrom & Heyden's time-derivative based 'continuous time matching constraints'. We suggest how to use the formalism for 'tensor tracking'-propagation of matching relations against a fixed base image along an image sequence. We relate this to non-linear tensor estimators and show how 'unwrapping the optimization loop' along the sequence allows simple 'linear n point' update estimates to converge rapidly to statistically near-optimal, near-consistent tensor estimates as the sequence proceeds. We also give guidelines as to when difference expansion is likely to be worthwhile as compared to a discrete approach.","PeriodicalId":358754,"journal":{"name":"Proceedings of the Seventh IEEE International Conference on Computer Vision","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120948948","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Direction diffusion","authors":"Bei Tang, G. Sapiro, Vicent Caselles","doi":"10.1109/ICCV.1999.790423","DOIUrl":"https://doi.org/10.1109/ICCV.1999.790423","url":null,"abstract":"In a number of disciplines, directional data provides a fundamental source of information. A novel framework for isotropic and anisotropic diffusion of directions is presented in this paper. The framework can be applied both to regularize directional data and to obtain multiscale representations of it. The basic idea is to apply and extend results from the theory of harmonic maps in liquid crystals. This theory deals with the regularization of vectorial data, while satisfying the unit norm constraint of directional data. We show the corresponding variational and partial differential equations formulations for isotropic diffusion, obtained from an L/sub 2/ norm, and edge preserving diffusion, obtained from an L/sub 1/ norm. In contrast with previous approaches, the framework is valid for directions in any dimensions, supports non-smooth data, and gives both isotropic and anisotropic formulations. We present a number of theoretical results, open questions, and examples for gradient vectors, optical flow, and color images.","PeriodicalId":358754,"journal":{"name":"Proceedings of the Seventh IEEE International Conference on Computer Vision","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125175750","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}