{"title":"Data-driven road detection","authors":"J. Álvarez, M. Salzmann, N. Barnes","doi":"10.1109/WACV.2014.6835730","DOIUrl":"https://doi.org/10.1109/WACV.2014.6835730","url":null,"abstract":"In this paper, we tackle the problem of road detection from RGB images. In particular, we follow a data-driven approach to segmenting the road pixels in an image. To this end, we introduce two road detection methods: A top-down approach that builds an image-level road prior based on the traffic pattern observed in an input image, and a bottom-up technique that estimates the probability that an image superpixel belongs to the road surface in a nonparametric manner. Both our algorithms work on the principle of label transfer in the sense that the road prior is directly constructed from the ground-truth segmentations of training images. Our experimental evaluation on four different datasets shows that this approach outperforms existing top-down and bottom-up techniques, and is key to the robustness of road detection algorithms to the dataset bias.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"60 1","pages":"1134-1141"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84797067","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Determining underwater vehicle movement from sonar data in relatively featureless seafloor tracking missions","authors":"A. Spears, A. Howard, M. West, Thomas Collins","doi":"10.1109/WACV.2014.6836007","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836007","url":null,"abstract":"Navigation through underwater environments is challenging given the lack of accurate positioning systems. The determination of underwater vehicle movement using an integrated acoustic sonar sensor would provide underwater vehicles with greatly increased autonomous navigation capabilities. A forward looking sonar sensor may be used for determining autonomous vehicle movement using filtering and optical flow algorithms. Optical flow algorithms have shown excellent results for vision image processing. However, they have been found difficult to implement using sonar data due to the high level of noise present, as well as the widely varying appearances of objects from frame to frame. For the bottom tracking applications considered, the simplifying assumption can be made that all features move with an equivalent direction and magnitude between frames. Statistical analysis of all estimated feature movements provides an accurate estimate of the overall shift, which translates directly to the vehicle movement. Results using acoustic sonar data are presented which illustrate the effectiveness of this methodology.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"1 1","pages":"909-916"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83030338","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic 3D change detection for glaucoma diagnosis","authors":"Lu Wang, V. Kallem, Mayank Bansal, J. Eledath, H. Sawhney, Denise J. Pearson, R. Stone","doi":"10.1109/WACV.2014.6836072","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836072","url":null,"abstract":"Important diagnostic criteria for glaucoma are changes in the 3D structure of the optic disc due to optic nerve damage. We propose an automatic approach for detecting these changes in 3D models reconstructed from fundus images of the same patient taken at different times. For each time session, only two uncalibrated fundus images are required. The approach applies a 6-point algorithm to estimate relative camera pose assuming a constant camera focal length. To deal with the instability of 3D reconstruction associated with fundus images, our approach keeps multiple candidate reconstruction solutions for each image pair. The best 3D reconstruction is found by optimizing the 3D registration of all images after an iterative bundle adjustment that tolerates possible structure changes. The 3D structure changes are detected by evaluating the reprojection errors of feature points in image space. We validate the approach by comparing the diagnosis results with manual grading by human experts on a fundus image dataset.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"55 1","pages":"401-408"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83542530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Online discriminative dictionary learning for visual tracking","authors":"Fan Yang, Zhuolin Jiang, L. Davis","doi":"10.1109/WACV.2014.6836014","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836014","url":null,"abstract":"Dictionary learning has been applied to various computer vision problems, such as image restoration, object classification and face recognition. In this work, we propose a tracking framework based on sparse representation and online discriminative dictionary learning. By associating dictionary items with label information, the learned dictionary is both reconstructive and discriminative, which better distinguishes target objects from the background. During tracking, the best target candidate is selected by a joint decision measure. Reliable tracking results and augmented training samples are accumulated into two sets to update the dictionary. Both online dictionary learning and the proposed joint decision measure are important for the final tracking performance. Experiments show that our approach outperforms several recently proposed trackers.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"15 1","pages":"854-861"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75212398","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ant tracking with occlusion tunnels","authors":"Thomas Fasciano, A. Dornhaus, M. Shin","doi":"10.1109/WACV.2014.6836002","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836002","url":null,"abstract":"The automated tracking of social insects, such as ants, can efficiently provide unparalleled amounts of data for the study of complex group behaviors. However, a high level of occlusion along with similarity in appearance and motion can cause the tracking to drift to an incorrect ant. In this paper, we reduce drifting by using occlusion to identify incorrect ants and prevent the tracking from drifting to them. The key idea is that a set of ants enter occlusion, move through occlusion, then exit occlusion. We do not attempt to track through occlusions but simply find a set of objects that enters and exits them. Knowing that tracking must stay within the set of ants exiting a given occlusion, we reduce drifting by preventing tracking from drifting to ants outside the occlusion. Using four 5000-frame video sequences of an ant colony, we demonstrate that the use of occlusion tunnels reduces the tracking error of (1) drifting to another ant by 30% and (2) early termination of tracking by 7%.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"49 1","pages":"947-952"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77551840","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-leaf alignment from fluorescence plant images","authors":"Xi Yin, Xiaoming Liu, Jin Chen, D. Kramer","doi":"10.1109/WACV.2014.6836067","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836067","url":null,"abstract":"In this paper, we propose a multi-leaf alignment framework based on Chamfer matching to study the problem of leaf alignment from fluorescence images of plants, which will provide a leaf-level analysis of photosynthetic activities. Different from the naive procedure of aligning leaves iteratively using the Chamfer distance, the new algorithm aims to find the best alignment of multiple leaves simultaneously in an input image. We formulate an optimization problem of an objective function with three terms: the average of chamfer distances of aligned leaves, the number of leaves, and the difference between the synthesized mask by the leaf candidates and the original image mask. Gradient descent is used to minimize our objective function. A quantitative evaluation framework is also formulated to test the performance of our algorithm. Experimental results show that the proposed multi-leaf alignment optimization performs substantially better than the baseline of the Chamfer matching algorithm in terms of both accuracy and efficiency.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"36 1","pages":"437-444"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76112119","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Combining semantic scene priors and haze removal for single image depth estimation","authors":"Ke Wang, Enrique Dunn, Joseph Tighe, Jan-Michael Frahm","doi":"10.1109/WACV.2014.6836021","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836021","url":null,"abstract":"We consider the problem of estimating the relative depth of a scene from a monocular image. The dark channel prior, used as a statistical observation of haze free images, has been previously leveraged for haze removal and relative depth estimation tasks. However, as a local measure, it fails to account for higher order semantic relationship among scene elements. We propose a dual channel prior used for identifying pixels that are unlikely to comply with the dark channel assumption, leading to erroneous depth estimates. We further leverage semantic segmentation information and patch match label propagation to enforce semantically consistent geometric priors. Experiments illustrate the quantitative and qualitative advantages of our approach when compared to state of the art methods.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"16 1","pages":"800-807"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74584603","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Beyond PASCAL: A benchmark for 3D object detection in the wild","authors":"Yu Xiang, Roozbeh Mottaghi, S. Savarese","doi":"10.1109/WACV.2014.6836101","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836101","url":null,"abstract":"3D object detection and pose estimation methods have become popular in recent years since they can handle ambiguities in 2D images and also provide a richer description for objects compared to 2D object detectors. However, most of the datasets for 3D recognition are limited to a small amount of images per category or are captured in controlled environments. In this paper, we contribute PASCAL3D+ dataset, which is a novel and challenging dataset for 3D object detection and pose estimation. PASCAL3D+ augments 12 rigid categories of the PASCAL VOC 2012 [4] with 3D annotations. Furthermore, more images are added for each category from ImageNet [3]. PASCAL3D+ images exhibit much more variability compared to the existing 3D datasets, and on average there are more than 3,000 object instances per category. We believe this dataset will provide a rich testbed to study 3D detection and pose estimation and will help to significantly push forward research in this area. We provide the results of variations of DPM [6] on our new dataset for object detection and viewpoint estimation in different scenarios, which can be used as baselines for the community. Our benchmark is available online at http://cvgl.stanford.edu/projects/pascal3d.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"14 1","pages":"75-82"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80353826","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian Optimization with an Empirical Hardness Model for approximate Nearest Neighbour Search","authors":"Julieta Martinez, J. Little, Nando de Freitas","doi":"10.1109/WACV.2014.6836049","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836049","url":null,"abstract":"Nearest Neighbour Search in high-dimensional spaces is a common problem in Computer Vision. Although no algorithm better than linear search is known, approximate algorithms are commonly used to tackle this problem. The drawback of using such algorithms is that their performance depends highly on parameter tuning. While this process can be automated using standard empirical optimization techniques, tuning is still time-consuming. In this paper, we propose to use Empirical Hardness Models to reduce the number of parameter configurations that Bayesian Optimization has to try, speeding up the optimization process. Evaluation on standard benchmarks of SIFT and GIST descriptors shows the viability of our approach.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"77 1","pages":"588-595"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80764441","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A lp-norm MTMKL framework for simultaneous detection of multiple facial action units","authors":"Xiao Zhang, M. Mahoor, S. Mavadati, J. Cohn","doi":"10.1109/WACV.2014.6835735","DOIUrl":"https://doi.org/10.1109/WACV.2014.6835735","url":null,"abstract":"Facial action unit (AU) detection is a challenging topic in computer vision and pattern recognition. Most existing approaches design classifiers to detect AUs individually or in AU combinations without considering the intrinsic relations among AUs. This paper presents a novel method, lp-norm multi-task multiple kernel learning (MTMKL), that jointly learns the classifiers for detecting the absence and presence of multiple AUs. lp-norm MTMKL is an extension of regularized multi-task learning, which learns shared kernels from a given set of base kernels among all the tasks within Support Vector Machines (SVM). Our approach has several advantages over existing methods: (1) AU detection is transformed into an MTL problem, where given a specific frame, multiple AUs are detected simultaneously by exploiting their inter-relations; (2) lp-norm multiple kernel learning is applied to increase the discriminant power of classifiers. Our experimental results on the CK+ and DISFA databases show that the proposed method outperforms the state-of-the-art methods for AU detection.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"17 1","pages":"1104-1111"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73480128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}