CVPR 2011. Pub Date: 2011-06-20. DOI: 10.1109/CVPR.2011.5995491
William Brendel, Alan Fern, S. Todorovic
"Probabilistic event logic for interval-based event recognition"
Abstract: This paper is about detecting and segmenting interrelated events that occur in challenging videos with motion blur, occlusions, dynamic backgrounds, and missing observations. We argue that holistic reasoning about the time intervals of events and their temporal constraints is critical in such domains to overcome the noise inherent in low-level video representations. For this purpose, our first contribution is the formulation of probabilistic event logic (PEL) for representing temporal constraints among events. A PEL knowledge base consists of confidence-weighted formulas from a temporal event logic and specifies a joint distribution over the occurrence time intervals of all events. Our second contribution is a MAP inference algorithm for PEL that addresses the scalability issue of reasoning about an enormous number of time intervals and their constraints in a typical video. Specifically, our algorithm leverages the spanning-interval data structure to compactly represent and manipulate entire sets of time intervals without enumerating them. Our experiments on interpreting basketball videos show that PEL inference is able to jointly detect events and identify their time intervals, based on noisy input from primitive-event detectors.
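The spanning-interval idea can be illustrated with a minimal, hypothetical Python sketch: one object stands for every interval whose start and end fall in given ranges, with constant-time membership tests and a closed-form size, instead of enumeration. The paper's actual data structure supports richer set operations; this only shows the representational trick.

```python
class SpanningInterval:
    """Compact stand-in for a set of time intervals [s, e] with
    s_lo <= s <= s_hi and e_lo <= e <= e_hi (and s <= e),
    manipulated without ever enumerating its members."""

    def __init__(self, s_lo, s_hi, e_lo, e_hi):
        self.s_lo, self.s_hi = s_lo, s_hi
        self.e_lo, self.e_hi = e_lo, e_hi

    def contains(self, s, e):
        # O(1) membership test -- no enumeration of intervals.
        return (self.s_lo <= s <= self.s_hi
                and self.e_lo <= e <= self.e_hi
                and s <= e)

    def count(self):
        # Size of the represented set, summed per start position.
        total = 0
        for s in range(self.s_lo, self.s_hi + 1):
            first_end = max(s, self.e_lo)
            if first_end <= self.e_hi:
                total += self.e_hi - first_end + 1
        return total
```

For example, `SpanningInterval(1, 3, 2, 5)` stands for eleven concrete intervals while storing only four integers.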
CVPR 2011. Pub Date: 2011-06-20. DOI: 10.1109/CVPR.2011.5995425
Mu Li, Xiao-Chen Lian, J. Kwok, Bao-Liang Lu
"Time and space efficient spectral clustering via column sampling"
Abstract: Spectral clustering is an elegant and powerful approach to clustering. However, the underlying eigen-decomposition takes cubic time and quadratic space w.r.t. the data set size. Both can be reduced by the Nyström method, which samples only a subset of columns from the matrix. However, the manipulation and storage of these sampled columns can still be expensive when the data set is large. In this paper, we propose a time- and space-efficient spectral clustering algorithm that can scale to very large data sets. A general procedure to orthogonalize the approximated eigenvectors is also proposed. Extensive spectral clustering experiments on a number of data sets, ranging in size from a few thousand to several million samples, demonstrate the accuracy and scalability of the proposed approach. We further apply it to the task of image segmentation. For images with more than 10 million pixels, this algorithm can obtain the eigenvectors in one minute on a single machine.
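The column-sampling idea can be sketched with a small Nyström-style approximation: only an n-by-m slice of the affinity matrix is ever formed, and approximate eigenvectors are re-orthogonalized afterwards. This is a hypothetical illustration (function name, kernel, scaling, and the QR step are illustrative choices, not the paper's exact procedure).

```python
import numpy as np

def nystrom_spectral_embed(X, m, k, sigma=1.0, seed=0):
    """Approximate the top-k eigenvectors of a Gaussian affinity
    matrix from m sampled columns, keeping cost O(n*m) instead of
    the O(n^2) space / O(n^3) time of the full decomposition."""
    rng = np.random.default_rng(seed)
    n = len(X)
    idx = rng.choice(n, size=m, replace=False)     # landmark sample
    # n x m affinity block -- the only part of the matrix we form.
    d2 = ((X[:, None, :] - X[None, idx, :]) ** 2).sum(-1)
    C = np.exp(-d2 / (2 * sigma ** 2))
    W = C[idx]                                     # m x m landmark block
    ew, ev = np.linalg.eigh(W)                     # ascending eigenvalues
    order = np.argsort(ew)[::-1][:k]               # take the top k
    # Nystrom extension of the landmark eigenvectors to all n points.
    U = C @ ev[:, order] / np.maximum(ew[order], 1e-12)
    # Re-orthogonalize the approximate eigenvectors (QR).
    Q, _ = np.linalg.qr(U)
    return Q
```

The returned columns can then be fed to k-means, as in standard spectral clustering pipelines.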
CVPR 2011. Pub Date: 2011-06-20. DOI: 10.1109/CVPR.2011.5995710
Marc'Aurelio Ranzato, J. Susskind, Volodymyr Mnih, Geoffrey E. Hinton
"On deep generative models with applications to recognition"
Abstract: The most popular way to use probabilistic models in vision is first to extract descriptors of small image patches or object parts using well-engineered features, and then to use statistical learning tools to model the dependencies among these features and eventual labels. Learning probabilistic models directly on the raw pixel values has proved to be much more difficult and is typically only used for regularizing discriminative methods. In this work, we use one of the best pixel-level generative models of natural images, a gated MRF, as the lowest level of a deep belief network (DBN) that has several hidden layers. We show that the resulting DBN is very good at coping with occlusion when predicting expression categories from face images, and that it can produce features that perform comparably to SIFT descriptors for discriminating different types of scenes. The generative ability of the model also makes it easy to see what information is captured and what is lost at each level of representation.
CVPR 2011. Pub Date: 2011-06-20. DOI: 10.1109/CVPR.2011.5995648
Yuelei Xie, Hong Chang, Zhe Li, Luhong Liang, Xilin Chen, Debin Zhao
"A unified framework for locating and recognizing human actions"
Abstract: In this paper, we present a pose-based approach for locating and recognizing human actions in videos. In our method, human poses are detected and represented based on the deformable part model. To our knowledge, this is the first work exploring the effectiveness of deformable part models in combining human detection and pose estimation for action recognition. Compared with previous methods, ours has three main advantages. First, our method does not rely on any assumption about video preprocessing quality, such as satisfactory foreground segmentation or reliable tracking. Second, we propose a novel compact representation for human pose that works together with human detection and can well represent the spatial and temporal structure inside an action. Third, with human detection taken into consideration, our framework can locate and recognize multiple actions in the same scene. Experiments on benchmark datasets and recorded cluttered videos verify the efficacy of our method.
CVPR 2011. Pub Date: 2011-06-20. DOI: 10.1109/CVPR.2011.5995497
Xiaobai Liu, Jiashi Feng, Shuicheng Yan, Liang Lin, Hai Jin
"Segment an image by looking into an image corpus"
Abstract: This paper investigates how to segment an image into semantic regions by harnessing an unlabeled image corpus. First, the image segmentation task is recast as a small-size patch grouping problem. Then, we discover two novel patch-pair priors, namely the first-order patch-pair density prior and the second-order patch-pair co-occurrence prior, founded on two statistical observations from the natural image corpus. The underlying rationale is: 1) a patch pair falling within the same object region generally has higher density than a patch pair spanning different objects, and 2) two patch pairs with high co-occurrence frequency are likely to bear similar semantic consistence confidences (SCCs), i.e., the confidence that the two constituent patches belong to the same semantic concept. These two discriminative priors are further integrated into a unified objective function in order to augment the intrinsic patch-pair similarities, originally calculated using patch-level visual features, into semantic consistence confidences. A nonnegative constraint is also imposed on the output variables, and an efficient iterative procedure is provided to seek the optimal solution. The ultimate patch grouping is conducted by first building a similarity graph, which takes the atomic patches as vertices and the augmented patch-pair SCCs as edge weights, and then employing the popular Normalized Cut approach to group patches into semantic clusters. Extensive image segmentation experiments on two public databases clearly demonstrate the superiority of the proposed approach over various state-of-the-art unsupervised image segmentation algorithms.
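The final grouping step relies on standard Normalized Cut machinery, which for a two-way split reduces to thresholding the second-smallest eigenvector of the normalized Laplacian. A generic sketch (not the paper's code; the paper applies this to its SCC-weighted patch graph):

```python
import numpy as np

def normalized_cut_bipartition(W):
    """Two-way Normalized Cut on a similarity graph with weight
    matrix W: build the symmetric normalized Laplacian, take its
    second-smallest eigenvector, and split nodes by its sign."""
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L_sym = np.eye(len(W)) - D_inv_sqrt @ W @ D_inv_sqrt
    ew, ev = np.linalg.eigh(L_sym)   # eigenvalues in ascending order
    fiedler = ev[:, 1]               # second-smallest eigenvector
    return (fiedler > 0).astype(int)
```

For k-way grouping one would instead take the first k eigenvectors and run k-means on the rows, as in the usual spectral clustering recipe.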
CVPR 2011. Pub Date: 2011-06-20. DOI: 10.1109/CVPR.2011.5995405
Tianfan Xue, Jianzhuang Liu, Xiaoou Tang
"Symmetric piecewise planar object reconstruction from a single image"
Abstract: Recovering 3D geometry from a single view of an object is an important and challenging problem in computer vision. Previous methods mainly focus on one specific class of objects without large topological changes, such as cars, faces, or human bodies. In this paper, we propose a novel single-view reconstruction algorithm for symmetric piecewise planar objects that is not restricted to particular object classes. Symmetry is ubiquitous in man-made and natural objects and provides rich information for 3D reconstruction. Given a single view of a symmetric piecewise planar object, we first find all the symmetric line pairs. The geometric properties of symmetric objects are used to narrow down the search space. Then, based on the symmetric lines, a depth map is recovered through a Markov random field. Experimental results show that our algorithm can efficiently recover the 3D shapes of different objects with significant topological variations.
CVPR 2011. Pub Date: 2011-06-20. DOI: 10.1109/CVPR.2011.5995615
Bing Li, Weihua Xiong, Weiming Hu, Ou Wu
"Evaluating combinational color constancy methods on real-world images"
Abstract: Light color estimation is crucial to the color constancy problem, and past decades have witnessed great progress in solving it. In contrast to traditional methods, many researchers have proposed a variety of combinational color constancy methods that apply different color constancy models to an image simultaneously and then produce a final estimate in diverse ways. Although many comprehensive evaluations and reviews of color constancy methods are available, few focus on combinational strategies. In this paper, we systematically survey some prevailing combinational strategies, divide them into three categories, and compare them on three real-world image data sets in terms of the angular error and the perceptual Euclidean distance. The experimental results show that combinational strategies with a training procedure consistently produce better performance.
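The two reported error metrics are easy to state concretely. A sketch follows; the angular error is the standard color constancy metric, while the channel weights shown for the perceptual Euclidean distance are illustrative assumptions, not necessarily those used in the paper's evaluation.

```python
import numpy as np

def angular_error_deg(est, gt):
    """Angle (in degrees) between the estimated and ground-truth
    illuminant RGB vectors; 0 means a perfect estimate."""
    est, gt = np.asarray(est, float), np.asarray(gt, float)
    cos = est @ gt / (np.linalg.norm(est) * np.linalg.norm(gt))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def perceptual_euclidean_distance(est, gt, weights=(0.21, 0.71, 0.08)):
    """Channel-weighted Euclidean distance between the two
    illuminants after normalizing each to sum to one. The weights
    here (emphasizing green) are illustrative placeholders."""
    est, gt = np.asarray(est, float), np.asarray(gt, float)
    est, gt = est / est.sum(), gt / gt.sum()
    w = np.asarray(weights, float)
    return float(np.sqrt((w * (est - gt) ** 2).sum()))
```

Note that the angular error is invariant to the overall intensity of the illuminant, which is why it is preferred over a raw RGB distance.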
CVPR 2011. Pub Date: 2011-06-20. DOI: 10.1109/CVPR.2011.5995325
Xingwei Yang, Longin Jan Latecki
"Affinity learning on a tensor product graph with applications to shape and image retrieval"
Abstract: As observed in several recent publications, improved retrieval performance is achieved when pairwise similarities between the query and the database objects are replaced with more global affinities that also consider the relations among the database objects. This is commonly achieved by propagating the similarity information in a weighted graph representing the database and query objects. Instead of propagating the similarity information on the original graph, we propose to utilize the tensor product graph (TPG) obtained by the tensor product of the original graph with itself. By virtue of this construction, not only local but also long-range similarities among graph nodes are explicitly represented as higher-order relations, making it possible to better reveal the intrinsic structure of the data manifold. In addition, we improve the local neighborhood structure of the original graph in a preprocessing stage. We illustrate the benefits of the proposed approach on shape and image ranking and retrieval tasks. We achieve a bull's-eye retrieval score of 99.99% on the MPEG-7 shape dataset, which is much higher than the state-of-the-art algorithms.
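The key computational point is that diffusion on the tensor product graph never requires forming the n^2-by-n^2 TPG matrix: with a transition matrix S whose spectral radius is below one, the propagation collapses to an n-by-n iteration of the form A <- S A S^T + I, whose fixed point encodes the diffused affinities. A hedged sketch (the row normalization and damping factor here are illustrative choices made to guarantee convergence, not necessarily the paper's exact construction):

```python
import numpy as np

def tpg_affinity(W, iters=300):
    """Learn global affinities by diffusion on the tensor product
    graph W (x) W, computed implicitly: iterate A <- S A S^T + I
    on n x n matrices instead of touching the n^2 x n^2 TPG."""
    W = np.asarray(W, float)
    S = W / W.sum(axis=1, keepdims=True)   # row-stochastic transitions
    S = 0.9 * S                            # damping so the series converges
    A = np.eye(len(W))
    for _ in range(iters):
        A = S @ A @ S.T + np.eye(len(W))
    return A
```

At convergence A satisfies A = S A S^T + I, i.e. it sums the higher-order relations S^k (S^T)^k over all path lengths k, which is exactly the long-range structure the abstract refers to.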
CVPR 2011. Pub Date: 2011-06-20. DOI: 10.1109/CVPR.2011.5995698
M. Wang, Xiaogang Wang
"Automatic adaptation of a generic pedestrian detector to a specific traffic scene"
Abstract: In recent years, significant progress has been made in learning generic pedestrian detectors from manually labeled large-scale training sets. However, when a generic pedestrian detector is applied to a specific scene where the testing data do not match the training data because of variations in viewpoint, resolution, illumination, and background, its accuracy may decrease greatly. In this paper, we propose a new framework for adapting a pre-trained generic pedestrian detector to a specific traffic scene by automatically selecting both confident positive and negative examples from the target scene to re-train the detector iteratively. An important feature of the proposed framework is that it utilizes models of vehicle and pedestrian paths, learned without supervision, together with multiple other cues such as locations, sizes, appearance, and motions to select new training samples. This scene-structure information increases the reliability of the selected samples and is complementary to the appearance-based detector, yet it was not well explored in previous studies. To further improve the reliability of the selected samples, outliers are removed through multiple hierarchical clustering steps. The effectiveness of the different cues and clustering steps is evaluated through experiments. The proposed approach significantly improves the accuracy of the generic pedestrian detector and also outperforms a scene-specific detector retrained using background subtraction. Its results are comparable with those of a detector trained on a large number of manually labeled frames from the target scene.
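Stripped of the scene-specific cues, the retraining loop is a self-training scheme: score unlabeled target samples, keep only the high-confidence ones as pseudo-labeled positives and negatives, and retrain. A toy sketch with a nearest-centroid classifier standing in for the detector (all names, thresholds, and the classifier itself are illustrative; the paper additionally filters candidates with path models, scene cues, and hierarchical clustering):

```python
import numpy as np

def fit_centroids(X, y):
    """Toy stand-in for detector training: one centroid per class."""
    return X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)

def score_centroids(model, X):
    """Probability-like score: closer to the positive centroid -> higher."""
    neg, pos = model
    d_neg = np.linalg.norm(X - neg, axis=1)
    d_pos = np.linalg.norm(X - pos, axis=1)
    return 1.0 / (1.0 + np.exp(d_pos - d_neg))

def self_train(X_src, y_src, X_tgt, rounds=3, pos_thr=0.9, neg_thr=0.1):
    """Adapt a source-trained model to unlabeled target data by
    iteratively adding only high-confidence pseudo-labeled samples."""
    model = fit_centroids(X_src, y_src)
    for _ in range(rounds):
        p = score_centroids(model, X_tgt)
        keep_pos, keep_neg = p >= pos_thr, p <= neg_thr
        X = np.vstack([X_src, X_tgt[keep_pos], X_tgt[keep_neg]])
        y = np.concatenate([y_src,
                            np.ones(keep_pos.sum()),
                            np.zeros(keep_neg.sum())])
        model = fit_centroids(X, y)
    return model
```

The confidence thresholds are what keeps the loop from drifting: ambiguous target samples (here, anything scored between 0.1 and 0.9) are simply left out of retraining.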
CVPR 2011. Pub Date: 2011-06-20. DOI: 10.1109/CVPR.2011.5995436
Daniel Glasner, S. Vitaladevuni, R. Basri
"Contour-based joint clustering of multiple segmentations"
Abstract: We present an unsupervised, shape-based method for joint clustering of multiple image segmentations. Given two or more closely related images, such as nearby frames in a video sequence or images of the same scene taken under different lighting conditions, our method generates a joint segmentation of the images. We introduce a novel contour-based representation that allows us to cast the shape-based joint clustering problem as a quadratic semi-assignment problem. Our score function is additive. We use complex-valued affinities to assess the quality of matching the edge elements at the exterior bounding contour of clusters, while ignoring the contributions of elements that fall in the interior of the clusters. We further combine this contour-based score with region information and use a linear programming relaxation to solve for the joint clusters. We evaluate our approach on the occlusion boundary dataset of Stein et al.