{"title":"Coupling video segmentation and action recognition","authors":"Amir Ghodrati, M. Pedersoli, T. Tuytelaars","doi":"10.1109/WACV.2014.6836045","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836045","url":null,"abstract":"Recently a lot of progress has been made in the field of video segmentation. The question then arises whether and how these results can be exploited for this other video processing challenge, action recognition. In this paper we show that a good segmentation is actually very important for recognition. We propose and evaluate several ways to integrate and combine the two tasks: i) recognition using a standard, bottom-up segmentation, ii) using a top-down segmentation geared towards actions, iii) using a segmentation based on inter-video similarities (co-segmentation), and iv) tight integration of recognition and segmentation via iterative learning. Our results clearly show that, on the one hand, the two tasks are interdependent and therefore an iterative optimization of the two makes sense and gives better results. On the other hand, comparable results can also be obtained with two separate steps but mapping the feature-space with a non-linear kernel.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"35 1","pages":"618-625"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72753658","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Finger-knuckle-print verification based on vector consistency of corresponding interest points","authors":"Min-Ki Kim, P. Flynn","doi":"10.1109/WACV.2014.6835996","DOIUrl":"https://doi.org/10.1109/WACV.2014.6835996","url":null,"abstract":"This paper proposes a novel finger-knuckle-print (FKP) verification method based on vector consistency among corresponding interest points (CIPs) detected from aligned finger images. We used two different approaches for reliable detection of CIPs; one method employs SIFT features and captures gradient directionality, and the other method employs phase correlation to represent the intensity field surrounding an interest point. The consistency of interframe displacements between pairs of matching CIPs in a match pair is used as a matching score. Such displacements will show consistency in a genuine match but not in an impostor match. Experimental results show that the proposed approach is effective in FKP verification.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"15 1","pages":"992-997"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76430927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Linear Local Distance coding for classification of HEp-2 staining patterns","authors":"Xiang Xu, F. Lin, Carol Ng, K. Leong","doi":"10.1109/WACV.2014.6836073","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836073","url":null,"abstract":"Indirect Immunofluorescence (IIF) on Human Epithelial-2 (HEp-2) cells is the recommended methodology for detecting some specific autoimmune diseases by searching for antinuclear antibodies (ANAs) within a patient's serum. Due to the limitations of IIF such as subjective evaluation, an automated Computer-Aided Diagnosis (CAD) system is required for diagnostic purposes. In particular, staining pattern classification of HEp-2 cells is a challenging task. In this paper, we adopt a feature extraction-coding-pooling framework, which has shown impressive performance in image classification tasks because it yields a discriminative and effective image representation. However, information loss is inevitable in the coding process. Therefore, we propose a Linear Local Distance (LLD) coding method to capture more discriminative information. LLD transforms the original local feature into a local distance vector by searching for the few nearest neighbors of the local feature in class-specific manifolds. The resulting local distance vector is further encoded and pooled to obtain a salient image representation. We demonstrate the effectiveness of the LLD method on a public HEp-2 cell dataset containing six major staining patterns. Experimental results show that our approach outperforms state-of-the-art coding methods for staining pattern classification of HEp-2 cells.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"8 1","pages":"393-400"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81445168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Structure-aware keypoint tracking for partial occlusion handling","authors":"W. Bouachir, Guillaume-Alexandre Bilodeau","doi":"10.1109/WACV.2014.6836011","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836011","url":null,"abstract":"This paper introduces a novel keypoint-based method for visual object tracking. To represent the target, we use a new model combining color distribution with keypoints. The appearance model also incorporates the spatial layout of the keypoints, encoding the object structure learned during tracking. With this multi-feature appearance model, our Structure-Aware Tracker (SAT) estimates accurately the target location using three main steps. First, the search space is reduced to the most likely image regions with a probabilistic approach. Second, the target location is estimated in the reduced search space using deterministic keypoint matching. Finally, the location prediction is corrected by exploiting the keypoint structural model with a voting-based method. By applying our SAT on several tracking problems, we show that location correction based on structural constraints is a key technique to improve prediction in moderately crowded scenes, even if only a small part of the target is visible. We also conduct comparison with a number of state-of-the-art trackers and demonstrate the competitiveness of the proposed method.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"28 1","pages":"877-884"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90531344","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-view action recognition one camera at a time","authors":"Scott Spurlock, Richard Souvenir","doi":"10.1109/WACV.2014.6836047","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836047","url":null,"abstract":"For human action recognition methods, there is often a trade-off between classification accuracy and computational efficiency. Methods that include 3D information from multiple cameras are often computationally expensive and not suitable for real-time application. 2D, frame-based methods are generally more efficient, but suffer from lower recognition accuracies. In this paper, we present a hybrid keypose-based method that operates in a multi-camera environment, but uses only a single camera at a time. We learn, for each keypose, the relative utility of a particular viewpoint compared with switching to a different available camera in the network for future classification. On a benchmark multi-camera action recognition dataset, our method outperforms approaches that incorporate all available cameras.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"91 1","pages":"604-609"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79517106","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Interactively test driving an object detector: Estimating performance on unlabeled data","authors":"Rushil Anirudh, P. Turaga","doi":"10.1109/WACV.2014.6836104","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836104","url":null,"abstract":"In this paper, we study the problem of `test-driving' a detector, i.e. allowing a human user to get a quick sense of how well the detector generalizes to their specific requirement. To this end, we present the first system that estimates detector performance interactively without extensive ground truthing using a human in the loop. We approach this as a problem of estimating proportions and show that it is possible to make accurate inferences on the proportion of classes or groups within a large data collection by observing only 5 - 10% of samples from the data. In estimating the false detections (for precision), the samples are chosen carefully such that the overall characteristics of the data collection are preserved. Next, inspired by its use in estimating disease propagation we apply pooled testing approaches to estimate missed detections (for recall) from the dataset. The estimates thus obtained are close to the ones obtained using ground truth, thus reducing the need for extensive labeling which is expensive and time consuming.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"91 1","pages":"175-182"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87849863","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparison of face detection and image classification for detecting front seat passengers in vehicles","authors":"Y. Artan, P. Paul, F. Perronnin, A. Burry","doi":"10.1109/WACV.2014.6835994","DOIUrl":"https://doi.org/10.1109/WACV.2014.6835994","url":null,"abstract":"Due to the high volume of traffic on modern roadways, transportation agencies have proposed High Occupancy Vehicle (HOV) lanes and High Occupancy Tolling (HOT) lanes to promote car pooling. However, enforcement of the rules of these lanes is currently performed by roadside enforcement officers using visual observation. Manual roadside enforcement is known to be inefficient, costly, potentially dangerous, and ultimately ineffective. Violation rates up to 50%-80% have been reported, while manual enforcement rates of less than 10% are typical. Therefore, there is a need for automated vehicle occupancy detection to support HOV/HOT lane enforcement. A key component of determining vehicle occupancy is to determine whether or not the vehicle's front passenger seat is occupied. In this paper, we examine two methods of determining vehicle front seat occupancy using a near infrared (NIR) camera system pointed at the vehicle's front windshield. The first method examines a state-of-the-art deformable part model (DPM) based face detection system that is robust to facial pose. The second method examines state-of-the-art local aggregation based image classification using bag-of-visual-words (BOW) and Fisher vectors (FV). A dataset of 3000 images was collected on a public roadway and is used to perform the comparison. From these experiments it is clear that the image classification approach is superior for this problem.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"1 1","pages":"1006-1012"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74341594","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scale-Space SIFT flow","authors":"Weichao Qiu, Xinggang Wang, X. Bai, A. Yuille, Z. Tu","doi":"10.1109/WACV.2014.6835734","DOIUrl":"https://doi.org/10.1109/WACV.2014.6835734","url":null,"abstract":"The state-of-the-art SIFT flow has been widely adopted for the general image matching task, especially in dealing with image pairs from similar scenes but with different object configurations. However, the way in which the dense SIFT features are computed at a fixed scale in the SIFT flow method limits its capability of dealing with scenes of large scale changes. In this paper, we propose a simple, intuitive, and very effective approach, Scale-Space SIFT flow, to deal with the large scale differences in different image locations. We introduce a scale field to the SIFT flow function to automatically explore the scale deformations. Our approach achieves similar performance as the SIFT flow method on general natural scenes but obtains significant improvement on the images with large scale differences. Compared with a recent method that addresses the similar problem, our approach shows its clear advantage being more effective, and significantly less demanding in memory and time requirement.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"4 1","pages":"1112-1119"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75073359","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extending explicit shape regression with mixed feature channels and pose priors","authors":"Matthias Richter, Hua Gao, H. K. Ekenel","doi":"10.1109/WACV.2014.6835993","DOIUrl":"https://doi.org/10.1109/WACV.2014.6835993","url":null,"abstract":"Facial feature detection offers a wide range of applications, e.g. in facial image processing, human computer interaction, consumer electronics, and the entertainment industry. These applications impose two antagonistic key requirements: high processing speed and high detection accuracy. We address both by expanding upon the recently proposed explicit shape regression [1] to (a) allow usage and mixture of different feature channels, and (b) include head pose information to improve detection performance in non-cooperative environments. Using the publicly available “wild” datasets LFW [10] and AFLW [11], we show that using these extensions outperforms the baseline (up to 10% gain in accuracy at 8% IOD) as well as other state-of-the-art methods.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"71 7","pages":"1013-1019"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/WACV.2014.6835993","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72454340","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Discovering discriminative cell attributes for HEp-2 specimen image classification","authors":"A. Wiliem, Peter Hobson, B. Lovell","doi":"10.1109/WACV.2014.6836071","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836071","url":null,"abstract":"Recently, there has been a growing interest in developing Computer Aided Diagnostic (CAD) systems for improving the reliability and consistency of pathology test results. This paper describes a novel CAD system for the Anti-Nuclear Antibody (ANA) test via Indirect Immunofluorescence protocol on Human Epithelial Type 2 (HEp-2) cells. While prior works have primarily focused on classifying cell images extracted from ANA specimen images, this work takes a further step by focussing on the specimen image classification problem itself. Our system is able to efficiently classify specimen images as well as producing meaningful descriptions of ANA pattern class which helps physicians to understand the differences between various ANA patterns. We achieve this goal by designing a specimen-level image descriptor that: (1) is highly discriminative; (2) has small descriptor length and (3) is semantically meaningful at the cell level. In our work, a specimen image descriptor is represented by its overall cell attribute descriptors. As such, we propose two max-margin based learning schemes to discover cell attributes whilst still maintaining the discrimination of the specimen image descriptor. Our learning schemes differ from the existing discriminative attribute learning approaches as they primarily focus on discovering image-level attributes. Comparative evaluations were undertaken to contrast the proposed approach to various state-of-the-art approaches on a novel HEp-2 cell dataset which was specifically proposed for the specimen-level classification. Finally, we showcase the ability of the proposed approach to provide textual descriptions to explain ANA patterns.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"11 1","pages":"423-430"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78498664","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}