{"title":"Solving Large Scale Binary Quadratic Problems: Spectral Methods vs. Semidefinite Programming","authors":"Carl Olsson, Anders P. Eriksson, Fredrik Kahl","doi":"10.1109/CVPR.2007.383202","DOIUrl":"https://doi.org/10.1109/CVPR.2007.383202","url":null,"abstract":"In this paper we introduce two new methods for solving binary quadratic problems. While spectral relaxation methods have been the workhorse subroutine for a wide variety of computer vision problems - segmentation, clustering, image restoration to name a few - it has recently been challenged by semidefinite programming (SDP) relaxations. In fact, it can be shown that SDP relaxations produce better lower bounds than spectral relaxations on binary problems with a quadratic objective function. On the other hand, the computational complexity for SDP increases rapidly as the number of decision variables grows making them inapplicable to large scale problems. Our methods combine the merits of both spectral and SDP relaxations -better (lower) bounds than traditional spectral methods and considerably faster execution times than SDP. The first method is based on spectral subgradients and can be applied to large scale SDPs with binary decision variables and the second one is based on the trust region problem. Both algorithms have been applied to several large scale vision problems with good performance.","PeriodicalId":351008,"journal":{"name":"2007 IEEE Conference on Computer Vision and Pattern Recognition","volume":"284 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116096140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Practical Online Active Learning for Classification","authors":"C. Monteleoni, Matti Kääriäinen","doi":"10.1109/CVPR.2007.383437","DOIUrl":"https://doi.org/10.1109/CVPR.2007.383437","url":null,"abstract":"We compare the practical performance of several recently proposed algorithms for active learning in the online classification setting. We consider two active learning algorithms (and their combined variants) that are strongly online, in that they access the data sequentially and do not store any previously labeled examples, and for which formal guarantees have recently been proven under various assumptions. We motivate an optical character recognition (OCR) application that we argue to be appropriately served by online active learning. We compare the practical efficacy, for this application, of the algorithm variants, and show significant reductions in label-complexity over random sampling.","PeriodicalId":351008,"journal":{"name":"2007 IEEE Conference on Computer Vision and Pattern Recognition","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115292500","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unsupervised Learning of Invariant Feature Hierarchies with Applications to Object Recognition","authors":"Marc'Aurelio Ranzato, Fu Jie Huang, Y-Lan Boureau, Yann LeCun","doi":"10.1109/CVPR.2007.383157","DOIUrl":"https://doi.org/10.1109/CVPR.2007.383157","url":null,"abstract":"We present an unsupervised method for learning a hierarchy of sparse feature detectors that are invariant to small shifts and distortions. The resulting feature extractor consists of multiple convolution filters, followed by a feature-pooling layer that computes the max of each filter output within adjacent windows, and a point-wise sigmoid non-linearity. A second level of larger and more invariant features is obtained by training the same algorithm on patches of features from the first level. Training a supervised classifier on these features yields 0.64% error on MNIST, and 54% average recognition rate on Caltech 101 with 30 training samples per category. While the resulting architecture is similar to convolutional networks, the layer-wise unsupervised training procedure alleviates the over-parameterization problems that plague purely supervised learning procedures, and yields good performance with very few labeled training samples.","PeriodicalId":351008,"journal":{"name":"2007 IEEE Conference on Computer Vision and Pattern Recognition","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121213912","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Region Classification with Markov Field Aspect Models","authors":"J. Verbeek, B. Triggs","doi":"10.1109/CVPR.2007.383098","DOIUrl":"https://doi.org/10.1109/CVPR.2007.383098","url":null,"abstract":"Considerable advances have been made in learning to recognize and localize visual object classes. Simple bag-of-feature approaches label each pixel or patch independently. More advanced models attempt to improve the coherence of the labellings by introducing some form of inter-patch coupling: traditional spatial models such as MRF's provide crisper local labellings by exploiting neighbourhood-level couplings, while aspect models such as PLSA and LDA use global relevance estimates (global mixing proportions for the classes appearing in the image) to shape the local choices. We point out that the two approaches are complementary, combining them to produce aspect-based spatial field models that outperform both approaches. We study two spatial models: one based on averaging over forests of minimal spanning trees linking neighboring image regions, the other on an efficient chain-based Expectation Propagation method for regular 8-neighbor Markov random fields. The models can be trained using either patch-level labels or image-level keywords. As input features they use factored observation models combining texture, color and position cues. Experimental results on the MSR Cambridge data sets show that combining spatial and aspect models significantly improves the region-level classification accuracy. In fact our models trained with image-level labels outperform PLSA trained with pixel-level ones.","PeriodicalId":351008,"journal":{"name":"2007 IEEE Conference on Computer Vision and Pattern Recognition","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123826353","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Belief Propagation for Vision Using Linear Constraint Nodes","authors":"B. Potetz","doi":"10.1109/CVPR.2007.383094","DOIUrl":"https://doi.org/10.1109/CVPR.2007.383094","url":null,"abstract":"Belief propagation over pairwise connected Markov random fields has become a widely used approach, and has been successfully applied to several important computer vision problems. However, pairwise interactions are often insufficient to capture the full statistics of the problem. Higher-order interactions are sometimes required. Unfortunately, the complexity of belief propagation is exponential in the size of the largest clique. In this paper, we introduce a new technique to compute belief propagation messages in time linear with respect to clique size for a large class of potential functions over real-valued variables. We demonstrate this technique in two applications. First, we perform efficient inference in graphical models where the spatial prior of natural images is captured by 2 times 2 cliques. This approach shows significant improvement over the commonly used pairwise-connected models, and may benefit a variety of applications using belief propagation to infer images or range images. Finally, we apply these techniques to shape-from-shading and demonstrate significant improvement over previous methods, both in quality and in flexibility.","PeriodicalId":351008,"journal":{"name":"2007 IEEE Conference on Computer Vision and Pattern Recognition","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121438057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Shadow Removal in Front Projection Environments Using Object Tracking","authors":"S. Audet, J. Cooperstock","doi":"10.1109/CVPR.2007.383470","DOIUrl":"https://doi.org/10.1109/CVPR.2007.383470","url":null,"abstract":"When an occluding object, such as a person, stands between a projector and a display surface, a shadow results. We can compensate by positioning multiple projectors so they produce identical and overlapping images and by using a system to locate shadows. Existing systems work by detecting either the shadows or the occluders. Shadow detection methods cannot remove shadows before they appear and are sensitive to video projection, while current occluder detection methods require near infrared cameras and illumination. Instead, we propose using a camera-based object tracker to locate the occluder and an algorithm to model the shadows. The algorithm can adapt to other tracking technologies as well. Despite imprecision in the calibration and tracking process, we found that our system performs effective shadow removal with sufficiently low processing delay for interactive applications with video projection.","PeriodicalId":351008,"journal":{"name":"2007 IEEE Conference on Computer Vision and Pattern Recognition","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121583453","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning Kernel Expansions for Image Classification","authors":"F. D. L. Torre, Oriol Vinyals","doi":"10.1109/CVPR.2007.383151","DOIUrl":"https://doi.org/10.1109/CVPR.2007.383151","url":null,"abstract":"Kernel machines (e.g. SVM, KLDA) have shown state-of-the-art performance in several visual classification tasks. The classification performance of kernel machines greatly depends on the choice of kernels and its parameters. In this paper, we propose a method to search over a space of parameterized kernels using a gradient-descent based method. Our method effectively learns a non-linear representation of the data useful for classification and simultaneously performs dimensionality reduction. In addition, we suggest a new matrix formulation that simplifies and unifies previous approaches. The effectiveness and robustness of the proposed algorithm is demonstrated in both synthetic and real examples of pedestrian and mouth detection in images.","PeriodicalId":351008,"journal":{"name":"2007 IEEE Conference on Computer Vision and Pattern Recognition","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116771289","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning to Detect A Salient Object","authors":"Tie Liu, Zejian Yuan, Jian Sun, Jingdong Wang, N. Zheng, Xiaoou Tang, H. Shum","doi":"10.1109/cvpr.2007.383047","DOIUrl":"https://doi.org/10.1109/cvpr.2007.383047","url":null,"abstract":"We study visual attention by detecting a salient object in an input image. We formulate salient object detection as an image segmentation problem, where we separate the salient object from the image background. We propose a set of novel features including multi-scale contrast, center-surround histogram, and color spatial distribution to describe a salient object locally, regionally, and globally. A conditional random field is learned to effectively combine these features for salient object detection. We also constructed a large image database containing tens of thousands of carefully labeled images by multiple users. To our knowledge, it is the first large image database for quantitative evaluation of visual attention algorithms. We validate our approach on this image database, which is public available with this paper.","PeriodicalId":351008,"journal":{"name":"2007 IEEE Conference on Computer Vision and Pattern Recognition","volume":"760 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125154609","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimizing Distribution-based Matching by Random Subsampling","authors":"A. Leung, S. Gong","doi":"10.1109/CVPR.2007.383183","DOIUrl":"https://doi.org/10.1109/CVPR.2007.383183","url":null,"abstract":"We boost the efficiency and robustness of distribution-based matching by random subsampling which results in the minimum number of samples required to achieve a specified probability that a candidate sampling distribution is a good approximation to the model distribution. The improvement is demonstrated with applications to object detection, mean-shift tracking using color distributions and tracking with improved robustness for low-resolution video sequences. The problem of minimizing the number of samples required for robust distribution matching is formulated as a constrained optimization problem with the specified probability as the objective function. We show that surprisingly mean-shift tracking using our method requires very few samples. Our experiments demonstrate that robust tracking can be achieved with even as few as 5 random samples from the distribution of the target candidate. This leads to a considerably reduced computational complexity that is also independent of object size. We show that random subsampling speeds up tracking by two orders of magnitude for typical object sizes.","PeriodicalId":351008,"journal":{"name":"2007 IEEE Conference on Computer Vision and Pattern Recognition","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114287341","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Effect of Pixel-Level Fusion on Object Tracking in Multi-Sensor Surveillance Video","authors":"N. Cvejic, S. G. Nikolov, H. Knowles, A. Loza, A. Achim, D. Bull, C. N. Canagarajah","doi":"10.1109/CVPR.2007.383433","DOIUrl":"https://doi.org/10.1109/CVPR.2007.383433","url":null,"abstract":"This paper investigates the impact of pixel-level fusion of videos from visible (VIZ) and infrared (IR) surveillance cameras on object tracking performance, as compared to tracking in single modality videos. Tracking has been accomplished by means of a particle filter which fuses a colour cue and the structural similarity measure (SSIM). The highest tracking accuracy has been obtained in IR sequences, whereas the VIZ video showed the worst tracking performance due to higher levels of clutter. However, metrics for fusion assessment clearly point towards the supremacy of the multiresolutional methods, especially Dual Tree-Complex Wavelet Transform method. Thus, a new, tracking-oriented metric is needed that is able to accurately assess how fusion affects the performance of the tracker.","PeriodicalId":351008,"journal":{"name":"2007 IEEE Conference on Computer Vision and Pattern Recognition","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125231074","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}