{"title":"Online discriminative dictionary learning for visual tracking","authors":"Fan Yang, Zhuolin Jiang, L. Davis","doi":"10.1109/WACV.2014.6836014","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836014","url":null,"abstract":"Dictionary learning has been applied to various computer vision problems, such as image restoration, object classification and face recognition. In this work, we propose a tracking framework based on sparse representation and online discriminative dictionary learning. By associating dictionary items with label information, the learned dictionary is both reconstructive and discriminative, which better distinguishes target objects from the background. During tracking, the best target candidate is selected by a joint decision measure. Reliable tracking results and augmented training samples are accumulated into two sets to update the dictionary. Both online dictionary learning and the proposed joint decision measure are important for the final tracking performance. Experiments show that our approach outperforms several recently proposed trackers.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"15 1","pages":"854-861"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75212398","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ant tracking with occlusion tunnels","authors":"Thomas Fasciano, A. Dornhaus, M. Shin","doi":"10.1109/WACV.2014.6836002","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836002","url":null,"abstract":"The automated tracking of social insects, such as ants, can efficiently provide unparalleled amounts of data for the study of complex group behaviors. However, a high level of occlusion along with similarity in appearance and motion can cause the tracking to drift to an incorrect ant. In this paper, we reduce drifting by using occlusion to identify incorrect ants and prevent the tracking from drifting to them. The key idea is that a set of ants enter occlusion, move through occlusion then exit occlusion. We do not attempt to track through occlusions but simply find a set of objects that enters and exits them. Knowing that tracking must stay within a set of ants exiting a given occlusion, we reduce drifting by preventing tracking to ants outside the occlusion. Using four 5000 frame video sequences of an ant colony, we demonstrate that the usage of occlusion tunnel reduces the tracking error of (1) drifting to another ant by 30% and (2) early termination of tracking by 7%.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"49 1","pages":"947-952"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77551840","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-leaf alignment from fluorescence plant images","authors":"Xi Yin, Xiaoming Liu, Jin Chen, D. Kramer","doi":"10.1109/WACV.2014.6836067","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836067","url":null,"abstract":"In this paper, we propose a multi-leaf alignment framework based on Chamfer matching to study the problem of leaf alignment from fluorescence images of plants, which will provide a leaf-level analysis of photosynthetic activities. Different from the naive procedure of aligning leaves iteratively using the Chamfer distance, the new algorithm aims to find the best alignment of multiple leaves simultaneously in an input image. We formulate an optimization problem of an objective function with three terms: the average of chamfer distances of aligned leaves, the number of leaves, and the difference between the synthesized mask by the leaf candidates and the original image mask. Gradient descent is used to minimize our objective function. A quantitative evaluation framework is also formulated to test the performance of our algorithm. Experimental results show that the proposed multi-leaf alignment optimization performs substantially better than the baseline of the Chamfer matching algorithm in terms of both accuracy and efficiency.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"36 1","pages":"437-444"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76112119","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Combining semantic scene priors and haze removal for single image depth estimation","authors":"Ke Wang, Enrique Dunn, Joseph Tighe, Jan-Michael Frahm","doi":"10.1109/WACV.2014.6836021","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836021","url":null,"abstract":"We consider the problem of estimating the relative depth of a scene from a monocular image. The dark channel prior, used as a statistical observation of haze free images, has been previously leveraged for haze removal and relative depth estimation tasks. However, as a local measure, it fails to account for higher order semantic relationship among scene elements. We propose a dual channel prior used for identifying pixels that are unlikely to comply with the dark channel assumption, leading to erroneous depth estimates. We further leverage semantic segmentation information and patch match label propagation to enforce semantically consistent geometric priors. Experiments illustrate the quantitative and qualitative advantages of our approach when compared to state of the art methods.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"16 1","pages":"800-807"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74584603","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Beyond PASCAL: A benchmark for 3D object detection in the wild","authors":"Yu Xiang, Roozbeh Mottaghi, S. Savarese","doi":"10.1109/WACV.2014.6836101","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836101","url":null,"abstract":"3D object detection and pose estimation methods have become popular in recent years since they can handle ambiguities in 2D images and also provide a richer description for objects compared to 2D object detectors. However, most of the datasets for 3D recognition are limited to a small amount of images per category or are captured in controlled environments. In this paper, we contribute PASCAL3D+ dataset, which is a novel and challenging dataset for 3D object detection and pose estimation. PASCAL3D+ augments 12 rigid categories of the PASCAL VOC 2012 [4] with 3D annotations. Furthermore, more images are added for each category from ImageNet [3]. PASCAL3D+ images exhibit much more variability compared to the existing 3D datasets, and on average there are more than 3,000 object instances per category. We believe this dataset will provide a rich testbed to study 3D detection and pose estimation and will help to significantly push forward research in this area. We provide the results of variations of DPM [6] on our new dataset for object detection and viewpoint estimation in different scenarios, which can be used as baselines for the community. Our benchmark is available online at http://cvgl.stanford.edu/projects/pascal3d.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"14 1","pages":"75-82"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80353826","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian Optimization with an Empirical Hardness Model for approximate Nearest Neighbour Search","authors":"Julieta Martinez, J. Little, Nando de Freitas","doi":"10.1109/WACV.2014.6836049","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836049","url":null,"abstract":"Nearest Neighbour Search in high-dimensional spaces is a common problem in Computer Vision. Although no algorithm better than linear search is known, approximate algorithms are commonly used to tackle this problem. The drawback of using such algorithms is that their performance depends highly on parameter tuning. While this process can be automated using standard empirical optimization techniques, tuning is still time-consuming. In this paper, we propose to use Empirical Hardness Models to reduce the number of parameter configurations that Bayesian Optimization has to try, speeding up the optimization process. Evaluation on standard benchmarks of SIFT and GIST descriptors shows the viability of our approach.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"77 1","pages":"588-595"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80764441","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A lp-norm MTMKL framework for simultaneous detection of multiple facial action units","authors":"Xiao Zhang, M. Mahoor, S. Mavadati, J. Cohn","doi":"10.1109/WACV.2014.6835735","DOIUrl":"https://doi.org/10.1109/WACV.2014.6835735","url":null,"abstract":"Facial action unit (AU) detection is a challenging topic in computer vision and pattern recognition. Most existing approaches design classifiers to detect AUs individually or AU combinations without considering the intrinsic relations among AUs. This paper presents a novel method, lp-norm multi-task multiple kernel learning (MTMKL), that jointly learns the classifiers for detecting the absence and presence of multiple AUs. lp-norm MTMKL is an extension of the regularized multi-task learning, which learns shared kernels from a given set of base kernels among all the tasks within Support Vector Machines (SVM). Our approach has several advantages over existing methods: (1) AU detection work is transformed to a MTL problem, where given a specific frame, multiple AUs are detected simultaneously by exploiting their inter-relations; (2) lp-norm multiple kernel learning is applied to increase the discriminant power of classifiers. Our experimental results on the CK+ and DISFA databases show that the proposed method outperforms the state-of-the-art methods for AU detection.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"17 1","pages":"1104-1111"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73480128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Coupling video segmentation and action recognition","authors":"Amir Ghodrati, M. Pedersoli, T. Tuytelaars","doi":"10.1109/WACV.2014.6836045","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836045","url":null,"abstract":"Recently a lot of progress has been made in the field of video segmentation. The question then arises whether and how these results can be exploited for this other video processing challenge, action recognition. In this paper we show that a good segmentation is actually very important for recognition. We propose and evaluate several ways to integrate and combine the two tasks: i) recognition using a standard, bottom-up segmentation, ii) using a top-down segmentation geared towards actions, iii) using a segmentation based on inter-video similarities (co-segmentation), and iv) tight integration of recognition and segmentation via iterative learning. Our results clearly show that, on the one hand, the two tasks are interdependent and therefore an iterative optimization of the two makes sense and gives better results. On the other hand, comparable results can also be obtained with two separate steps but mapping the feature-space with a non-linear kernel.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"35 1","pages":"618-625"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72753658","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Finger-knuckle-print verification based on vector consistency of corresponding interest points","authors":"Min-Ki Kim, P. Flynn","doi":"10.1109/WACV.2014.6835996","DOIUrl":"https://doi.org/10.1109/WACV.2014.6835996","url":null,"abstract":"This paper proposes a novel finger-knuckle-print (FKP) verification method based on vector consistency among corresponding interest points (CIPs) detected from aligned finger images. We used two different approaches for reliable detection of CIPs; one method employs SIFT features and captures gradient directionality, and the other method employs phase correlation to represent the intensity field surrounding an interest point. The consistency of interframe displacements between pairs of matching CIPs in a match pair is used as a matching score. Such displacements will show consistency in a genuine match but not in an impostor match. Experimental results show that the proposed approach is effective in FKP verification.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"15 1","pages":"992-997"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76430927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Linear Local Distance coding for classification of HEp-2 staining patterns","authors":"Xiang Xu, F. Lin, Carol Ng, K. Leong","doi":"10.1109/WACV.2014.6836073","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836073","url":null,"abstract":"Indirect Immunofluorescence (IIF) on Human Epithelial-2 (HEp-2) cells is the recommended methodology for detecting some specific autoimmune diseases by searching for antinuclear antibodies (ANAs) within a patient's serum. Due to the limitations of IIF such as subjective evaluation, automated Computer-Aided Diagnosis (CAD) system is required for diagnostic purposes. In particular, staining patterns classification of HEp-2 cells is a challenging task. In this paper, we adopt a feature extraction-coding-pooling framework which has shown impressive performance in image classification tasks, because it can obtain discriminative and effective image representation. However, the information loss is inevitable in the coding process. Therefore, we propose a Linear Local Distance (LLD) coding method to capture more discriminative information. LLD transforms original local feature to local distance vector by searching for local nearest few neighbors of local feature in the class-specific manifolds. The obtained local distance vector is further encoded and pooled together to get salient image representation. We demonstrate the effectiveness of LLD method on a public HEp-2 cells dataset containing six major staining patterns. Experimental results show that our approach has a superior performance to the state-of-the-art coding methods for staining patterns classification of HEp-2 cells.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"8 1","pages":"393-400"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81445168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}