{"title":"Weighted and robust incremental method for subspace learning","authors":"D. Skočaj, A. Leonardis","doi":"10.1109/ICCV.2003.1238667","DOIUrl":"https://doi.org/10.1109/ICCV.2003.1238667","url":null,"abstract":"Visual learning is expected to be a continuous and robust process, which treats input images and pixels selectively. In this paper, we present a method for subspace learning, which takes these considerations into account. We present an incremental method, which sequentially updates the principal subspace considering weighted influence of individual images as well as individual pixels within an image. This approach is further extended to enable determination of consistencies in the input data and imputation of the values in inconsistent pixels using the previously acquired knowledge, resulting in a novel incremental, weighted and robust method for subspace learning.","PeriodicalId":131580,"journal":{"name":"Proceedings Ninth IEEE International Conference on Computer Vision","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132474491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Geometric segmentation of perspective images based on symmetry groups","authors":"A. Yang, Shankar R. Rao, Kun Huang, Wei Hong, Yi Ma","doi":"10.1109/ICCV.2003.1238634","DOIUrl":"https://doi.org/10.1109/ICCV.2003.1238634","url":null,"abstract":"Symmetry is an effective geometric cue to facilitate conventional segmentation techniques on images of man-made environment. Based on three fundamental principles that summarize the relations between symmetry and perspective imaging, namely, structure from symmetry, symmetry hypothesis testing, and global symmetry testing, we develop a prototype system which is able to automatically segment symmetric objects in space from single 2D perspective images. The result of such a segmentation is a hierarchy of geometric primitives, called symmetry cells and complexes, whose 3D structure and pose are fully recovered. Such a geometrically meaningful segmentation may greatly facilitate applications such as feature matching and robot navigation.","PeriodicalId":131580,"journal":{"name":"Proceedings Ninth IEEE International Conference on Computer Vision","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132531259","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cumulative residual entropy, a new measure of information & its application to image alignment","authors":"Fei Wang, B. Vemuri, M. Rao, Yunmei Chen","doi":"10.1109/ICCV.2003.1238395","DOIUrl":"https://doi.org/10.1109/ICCV.2003.1238395","url":null,"abstract":"We use the cumulative distribution of a random variable to define the information content in it and use it to develop a novel measure of information that parallels Shannon entropy, which we dub cumulative residual entropy (CRE). The key features of CRE may be summarized as, (1) its definition is valid in both the continuous and discrete domains, (2) it is mathematically more general than the Shannon entropy and (3) its computation from sample data is easy and these computations converge asymptotically to the true values. We define the cross-CRE (CCRE) between two random variables and apply it to solve the uni- and multimodal image alignment problem for parameterized (rigid, affine and projective) transformations. The key strengths of the CCRE over using the now popular mutual information method (based on Shannon's entropy) are that the former has significantly larger noise immunity and a much larger convergence range over the field of parameterized transformations. These strengths of CCRE are demonstrated via experiments on synthesized and real image data.","PeriodicalId":131580,"journal":{"name":"Proceedings Ninth IEEE International Conference on Computer Vision","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132591966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Polarization-based inverse rendering from a single view","authors":"D. Miyazaki, R. Tan, K. Hara, K. Ikeuchi","doi":"10.1109/ICCV.2003.1238455","DOIUrl":"https://doi.org/10.1109/ICCV.2003.1238455","url":null,"abstract":"This paper presents a method to estimate geometrical, photometrical, and environmental information of a single-viewed object in one integrated framework under fixed viewing position and fixed illumination direction. These three types of information are important to render a photorealistic image of a real object. Photometrical information represents the texture and the surface roughness of an object, while geometrical and environmental information represent the 3D shape of an object and the illumination distribution, respectively. The proposed method estimates the 3D shape by computing the surface normal from polarization data, calculates the texture of the object from the diffuse only reflection component, determines the illumination directions from the position of the brightest intensity in the specular reflection component, and finally computes the surface roughness of the object by using the estimated illumination distribution.","PeriodicalId":131580,"journal":{"name":"Proceedings Ninth IEEE International Conference on Computer Vision","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133110668","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Two-frame wide baseline matching","authors":"Jiangjian Xiao, M. Shah","doi":"10.1109/ICCV.2003.1238403","DOIUrl":"https://doi.org/10.1109/ICCV.2003.1238403","url":null,"abstract":"We describe a novel approach to automatically recover corresponding feature points and epipolar geometry over two wide baseline frames. Our contributions consist of several aspects: First, the use of an affine invariant feature, edge-corner, is introduced to provide a robust and consistent matching primitives. Second, based on SVD decomposition of affine matrix, the affine matching space between two corners can be approximately divided into two independent spaces by rotation angle and scaling factor. Employing this property, a two-stage affine matching algorithm is designed to obtain robust matches over two frames. Third, using the epipolar geometry estimated by these matches, more corresponding feature points are determined. Based on these robust correspondences, the fundamental matrix is refined, and a series of virtual views of the scene are synthesized. Finally, several experiments are presented to illustrate that a number of robust correspondences can be stably determined for two wide baseline images under significant camera motions with illumination changes, occlusions, and self-similarities. After testing a number of examples and comparing with the existing methods, the experimental results strongly demonstrate that our matching method outperforms the state-of-art algorithms for all of the test cases.","PeriodicalId":131580,"journal":{"name":"Proceedings Ninth IEEE International Conference on Computer Vision","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130066055","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive dynamic range imaging: optical control of pixel exposures over space and time","authors":"S. Nayar, Vlad Branzoi","doi":"10.1109/ICCV.2003.1238624","DOIUrl":"https://doi.org/10.1109/ICCV.2003.1238624","url":null,"abstract":"This paper presents a new approach to imaging that significantly enhances the dynamic range of a camera. The key idea is to adapt the exposure of each pixel on the image detector, based on the radiance value of the corresponding scene point. This adaptation is done in the optical domain, that is, during image formation. In practice, this is achieved using a spatial light modulator whose transmittance can be varied with high resolution over space and time. A real-time control algorithm is developed that uses acquired images to automatically adjust the transmittance function of the spatial modulator. Each captured image and its corresponding transmittance function are used to compute a very high dynamic range image that is linear in scene radiance. We have implemented a video-rate adaptive dynamic range camera that consists of a color CCD detector and a controllable liquid crystal light modulator. Experiments have been conducted in scenarios with complex and harsh lighting conditions. The results indicate that adaptive imaging can have a significant impact on vision applications such as monitoring, tracking, recognition, and navigation.","PeriodicalId":131580,"journal":{"name":"Proceedings Ninth IEEE International Conference on Computer Vision","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114502629","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A new perspective [on] shape-from-shading","authors":"A. Tankus, N. Sochen, Y. Yeshurun","doi":"10.1109/ICCV.2003.1238439","DOIUrl":"https://doi.org/10.1109/ICCV.2003.1238439","url":null,"abstract":"Shape-from-shading (SFS) is a fundamental problem in computer vision. The vast majority of research in this field have assumed orthography as its projection model. This paper reexamines the basis of SFS, the image irradiance equation, under an assumption of perspective projection. The paper also shows that the perspective image irradiance equation depends merely on the natural logarithm of the depth function (and not on the depth function itself), and as such it is invariant to scale changes of the depth function. We then suggest a simple reconstruction algorithm based on the perspective formula, and compare it to existing orthographic SFS algorithms. This simple algorithm obtained lower error rates than legacy SFS algorithms, and equated with and sometimes surpassed state-of-the-art algorithms. These findings lend support to the assumption that transition to a more realistic set of assumptions improves reconstruction significantly.","PeriodicalId":131580,"journal":{"name":"Proceedings Ninth IEEE International Conference on Computer Vision","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128082388","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unsupervised image translation","authors":"Rómer Rosales, Kannan Achan, B. Frey","doi":"10.1109/ICCV.2003.1238384","DOIUrl":"https://doi.org/10.1109/ICCV.2003.1238384","url":null,"abstract":"An interesting and potentially useful vision/graphics task is to render an input image in an enhanced form or also in an unusual style; for example with increased sharpness or with some artistic qualities. In previous work [10, 5], researchers showed that by estimating the mapping from an input image to a registered (aligned) image of the same scene in a different style or resolution, the mapping could be used to render a new input image in that style or resolution. Frequently a registered pair is not available, but instead the user may have only a source image of an unrelated scene that contains the desired style. In this case, the task of inferring the output image is much more difficult since the algorithm must both infer correspondences between features in the input image and the source image, and infer the unknown mapping between the images. We describe a Bayesian technique for inferring the most likely output image. The prior on the output image P(X) is a patch-based Markov random field obtained from the source image. The likelihood of the input P(Y/spl bsol/X) is a Bayesian network that can represent different rendering styles. We describe a computationally efficient, probabilistic inference and learning algorithm for inferring the most likely output image and learning the rendering style. We also show that current techniques for image restoration or reconstruction proposed in the vision literature (e.g., image super-resolution or de-noising) and image-based nonphotorealistic rendering could be seen as special cases of our model. We demonstrate our technique using several tasks, including rendering a photograph in the artistic style of an unrelated scene, de-noising, and texture transfer.","PeriodicalId":131580,"journal":{"name":"Proceedings Ninth IEEE International Conference on Computer Vision","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133595434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Capturing subtle facial motions in 3D face tracking","authors":"Zhen Wen, Thomas S. Huang","doi":"10.1109/ICCV.2003.1238646","DOIUrl":"https://doi.org/10.1109/ICCV.2003.1238646","url":null,"abstract":"Facial motions produce not only facial feature points motions, but also subtle appearance changes such as wrinkles and shading changes. These subtle changes are important yet difficult issues for both analysis (tracking) and synthesis (animation). Previous approaches were mostly based on models learned from extensive training appearance examples. However, the space of all possible facial motion appearance is huge. Thus, it is not feasible to collect samples covering all possible variations due to lighting conditions, individualities, and head poses. Therefore, it is difficult to adapt such models to new conditions. In this paper, we present an adaptive technique for analyzing subtle facial appearance changes. We propose a new ratio-image based appearance feature, which is independent of a person's face albedo. This feature is used to track face appearance variations based on exemplars. To adapt the exemplar appearance model to new people and lighting conditions, we develop an online EM-based algorithm. Experiments show that the proposed method improves classification results in a facial expression recognition task, where a variety of people and lighting conditions are involved.","PeriodicalId":131580,"journal":{"name":"Proceedings Ninth IEEE International Conference on Computer Vision","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121870334","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recognizing human action efforts: an adaptive three-mode PCA framework","authors":"James W. Davis, Hui Gao","doi":"10.1109/ICCV.2003.1238662","DOIUrl":"https://doi.org/10.1109/ICCV.2003.1238662","url":null,"abstract":"We present a computational framework capable of labeling the effort of an action corresponding to the perceived level of exertion by the performer (low - high). The approach initially factorizes examples (at different efforts) of an action into its three-mode principal components to reduce the dimensionality. Then a learning phase is introduced to compute expressive-feature weights to adjust the model's estimation of effort to conform to given perceptual labels for the examples. Experiments are demonstrated recognizing the efforts of a person carrying bags of different weight and for multiple people walking at different paces.","PeriodicalId":131580,"journal":{"name":"Proceedings Ninth IEEE International Conference on Computer Vision","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123561238","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}