{"title":"Likelihood Map Fusion for Visual Object Tracking","authors":"Zhaozheng Yin, F. Porikli, R. Collins","doi":"10.1109/WACV.2008.4544036","DOIUrl":"https://doi.org/10.1109/WACV.2008.4544036","url":null,"abstract":"Visual object tracking can be considered as a figure-ground classification task. In this paper, different features are used to generate a set of likelihood maps for each pixel indicating the probability of that pixel belonging to foreground object or scene background. For example, intensity, texture, motion, saliency and template matching can all be used to generate likelihood maps. We propose a generic likelihood map fusion framework to combine these heterogeneous features into a fused soft segmentation suitable for mean-shift tracking. All the component likelihood maps contribute to the segmentation based on their classification confidence scores (weights) learned from the previous frame. The evidence combination framework dynamically updates the weights such that, in the fused likelihood map, discriminative foreground/background information is preserved while ambiguous information is suppressed. The framework is applied here to track ground vehicles from thermal airborne video, and is also compared to other state-of-the-art algorithms.","PeriodicalId":439571,"journal":{"name":"2008 IEEE Workshop on Applications of Computer Vision","volume":"78 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121725322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Intraoperative Visualization of Anatomical Targets in Retinal Surgery","authors":"Ioana Fleming, S. Voros, Balázs P. Vágvölgyi, Zachary A. Pezzementi, J. Handa, R. Taylor, Gregory Hager","doi":"10.1109/WACV.2008.4544034","DOIUrl":"https://doi.org/10.1109/WACV.2008.4544034","url":null,"abstract":"Certain surgical procedures require a high degree of precise manual control within a very restricted area. Retinal surgeries are part of this group of procedures. During vitreoretinal surgery, the surgeon must visualize, using a microscope, an area spanning a few hundreds of microns in diameter and manually correct the potential pathology using direct contact, free hand techniques. In addition, the surgeon must find an effective compromise between magnification, depth perception, field of view, and clarity of view. Pre-operative images are used to locate interventional targets, and also to assess and plan the surgical procedure. This paper proposes a method of fusing information contained in pre-operative imagery, such as fundus and OCT images, with intra-operative video to increase accuracy in finding the target areas. We describe methods for maintaining, in real-time, registration with anatomical features and target areas using image processing. This registration allows us to produce information enhanced displays that ensure that the retinal surgeon is always in visual contact with his/her area of interest.","PeriodicalId":439571,"journal":{"name":"2008 IEEE Workshop on Applications of Computer Vision","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114810702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Color Enhancement in Image Fusion","authors":"Chao Zhang, Azhar A. Sufi","doi":"10.1109/WACV.2008.4543994","DOIUrl":"https://doi.org/10.1109/WACV.2008.4543994","url":null,"abstract":"We propose an innovative approach to enhance the color in image fusion. Previous work in color fusion has focused on the pseudo-color fusion. The color components are directly related to or linearly combined with the luminance images in different modalities. However, no work has been done to study the effects on the chrominance in the fused image when the luminance is significantly changed. For example, when objects in the visible image are dark but bright in the IR image, the fused objects become brighter. The color in the fused image looks pale. If the IR signal is strong enough such that the fused objects become saturated, the color is washed out. To solve these problems, we propose a gamma correction function to the normalized color components. The gamma function is a function of the difference between the fused result and the luminance of the visible image. In order to recover the chrominance in saturated region, we designed a damping function to drag the pixels in the saturated region closer to the visible image. We also discuss the orthogonalization of color bands in the correlated color space. The orthogonalization helps reduce the color skew during color enhancement.","PeriodicalId":439571,"journal":{"name":"2008 IEEE Workshop on Applications of Computer Vision","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130373434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust Human Pose Recognition Using Unlabelled Markers","authors":"Yi Wang, G. Qian","doi":"10.1109/WACV.2008.4544037","DOIUrl":"https://doi.org/10.1109/WACV.2008.4544037","url":null,"abstract":"In this paper, we tackle robust human pose recognition using unlabelled markers obtained from an optical marker-based motion capture system. A coarse-to-fine fast pose matching algorithm is presented with the following three steps. Given a query pose, firstly, the majority of the non-matching poses are rejected according to marker distributions along the radius and height dimensions. Secondly, relative rotation angles between the query pose and the remaining candidate poses are estimated using a fast histogram matching method based on circular convolution implemented using the fast Fourier transform. Finally, rotation angle estimates are refined using nonlinear least square minimization through the Levenberg-Marquardt minimization. In the presence of multiple solutions, false poses can be effectively removed by thresholding the minimized matching scores. The proposed framework can handle missing markers caused by occlusion. Experimental results using real motion capture data show the efficacy of the proposed approach.","PeriodicalId":439571,"journal":{"name":"2008 IEEE Workshop on Applications of Computer Vision","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126217036","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust 6DOF Motion Estimation for Non-Overlapping, Multi-Camera Systems","authors":"Brian Clipp, Jae-Hak Kim, Jan-Michael Frahm, M. Pollefeys, R. Hartley","doi":"10.1109/WACV.2008.4544011","DOIUrl":"https://doi.org/10.1109/WACV.2008.4544011","url":null,"abstract":"This paper introduces a novel, robust approach for 6DOF motion estimation of a multi-camera system with non-overlapping views. The proposed approach is able to solve the pose estimation, including scale, for a two camera system with non-overlapping views. In contrast to previous approaches, it degrades gracefully if the motion is close to degenerate. For degenerate motions the technique estimates the remaining 5DOF. The proposed technique is evaluated on real and synthetic sequences.","PeriodicalId":439571,"journal":{"name":"2008 IEEE Workshop on Applications of Computer Vision","volume":"79 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117232831","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cascading Trilinear Tensors for Face Authentication","authors":"G. Wagner, E. Sinzinger","doi":"10.1109/WACV.2008.4544029","DOIUrl":"https://doi.org/10.1109/WACV.2008.4544029","url":null,"abstract":"This paper presents a method to improve the accuracy rates of face authentication between images with different poses. Trilinear tensors are used to adjust the pose of the training and testing images. All the images are transformed by a pose adjustment algorithm so novel images are generated that have the same pose. These novel images are then used to train and test support vector machine (SVM) face authentication functions to verify the identity of the people in the images. The overall results show that the accuracy improves when the poses of the images are adjusted.","PeriodicalId":439571,"journal":{"name":"2008 IEEE Workshop on Applications of Computer Vision","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124455426","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"INSPEC2T: Inexpensive Spectrometer Color Camera Technology","authors":"W. Scheirer, S. Kirkbride, T. Boult","doi":"10.1109/WACV.2008.4543999","DOIUrl":"https://doi.org/10.1109/WACV.2008.4543999","url":null,"abstract":"Modern spectrometer equipment tends to be expensive, thus increasing the cost of emerging systems that take advantage of spectral properties as part of their operation. This paper introduces a novel technique that exploits the spectral response characteristics of a traditional sensor (i.e. CMOS or CCD) to utilize it as a low-cost spectrometer. Using the raw Bayer pattern data from a sensor, we estimate the brightness and wavelength of the measured light at a particular point. We use this information to support wide dynamic range, high noise tolerance, and, if sampling takes place on a slope, sub-pixel resolution. Experimental results are provided for both simulation and real data. Further, we investigate the potential of this low-cost technology for spoof detection in biometric systems. Lastly, an actual hardware systhesis is conducted to show the ease with which this algorithm can be implemented onto an FPGA.","PeriodicalId":439571,"journal":{"name":"2008 IEEE Workshop on Applications of Computer Vision","volume":"152 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127315407","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Event-Driven Visual Sensor Networks: Issues in Reliability","authors":"Alexandra Czarlinska, D. Kundur","doi":"10.1109/WACV.2008.4544039","DOIUrl":"https://doi.org/10.1109/WACV.2008.4544039","url":null,"abstract":"Event-driven visual sensor networks (VSNs) rely on a combination of camera nodes and scalar sensors to determine if a frame contains an event of interest that should be transmitted to the cluster head. The appeal of event-driven VSNs stems from the possibility of eliminating non-relevant frames at the source thus implicitly minimizing the amount of energy required for coding and transmission. The challenges of the event-driven paradigm result from the vulnerability of scalar sensors to attack or error and from the lightweight image processing available to the camera nodes due to resource constraints. In this work we focus on the reliability issues of VSNs in the case of global actuation attacks on the scalar sensors. We study the extent to which various utility functions enable an attacker to increase the average expected number of affected nodes with a relatively small penalty in the loss of stealth. We then discuss tradeoffs between different attack detection strategies in terms of the cost of processing and the required information at the cluster head and nodes.","PeriodicalId":439571,"journal":{"name":"2008 IEEE Workshop on Applications of Computer Vision","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126478794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Interactive Portrait Art","authors":"Ye Ning, T. Sim","doi":"10.1109/WACV.2008.4543998","DOIUrl":"https://doi.org/10.1109/WACV.2008.4543998","url":null,"abstract":"Traditionally, enjoying a portrait art, e.g. the Mona Lisa, is a passive activity. The spectator merely views the painting and admires the brush strokes, composition, etc. But now, with real-time computer vision and graphics algorithms, we can inject interactivity into portrait art, thereby bringing these art works back to life and giving a new dimension to art enjoyment. Specifically, in our art installation, a spectator is allowed to animate the face in a portrait art work to produce any expression she/he likes. The system consists of one personal computer and one camera. Given one frontal portrait picture as input, it generates an animation-ready avatar with minimum user intervention. The spectator then can perform before the camera any expression. And the system will capture the facial motion of the spectator and retarget the motion to the avatar. The animation of the avatar is rendered back to the original portrait picture. The motion retargetting is done in real-time.","PeriodicalId":439571,"journal":{"name":"2008 IEEE Workshop on Applications of Computer Vision","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128198151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Object Categorization Based on Kernel Principal Component Analysis of Visual Words","authors":"K. Hotta","doi":"10.1109/WACV.2008.4543993","DOIUrl":"https://doi.org/10.1109/WACV.2008.4543993","url":null,"abstract":"Many researchers are studying object categorization problem. It is reported that bag of keypoints approach which is based on local features without topological information is effective for object categorization. Conventional bag of keypoints approach selects the visual words by clustering and uses the similarity with each visual word as the features for classification. In this paper, we model the ensemble of visual words, and the similarities with ensemble of visual words not each visual word are used for classification. Kernel principal component analysis (KPCA) is used to model them and extract the information specialized for each category. The projection length in subspace is used as features for support vector machine (SVM). There are two reasons why we use KPCA to model the ensemble of visual words. The first reason is to model the non-linear variations induced by various kinds of visual words. The second reason is that KPCA of local features is robust to pose variations. The proposed method is evaluated using Caltech 101 database. We confirm that the proposed method is comparable with the state of the art methods without absolute position information.","PeriodicalId":439571,"journal":{"name":"2008 IEEE Workshop on Applications of Computer Vision","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121720062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}