{"title":"Salient object detection via bootstrap learning","authors":"Na Tong, Huchuan Lu, Xiang Ruan, Ming-Hsuan Yang","doi":"10.1109/CVPR.2015.7298798","DOIUrl":"https://doi.org/10.1109/CVPR.2015.7298798","url":null,"abstract":"We propose a bootstrap learning algorithm for salient object detection in which both weak and strong models are exploited. First, a weak saliency map is constructed based on image priors to generate training samples for a strong model. Second, a strong classifier based on samples directly from an input image is learned to detect salient pixels. Results from multiscale saliency maps are integrated to further improve the detection performance. Extensive experiments on six benchmark datasets demonstrate that the proposed bootstrap learning algorithm performs favorably against the state-of-the-art saliency detection methods. Furthermore, we show that the proposed bootstrap learning approach can be easily applied to other bottom-up saliency models for significant improvement.","PeriodicalId":444472,"journal":{"name":"2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"129 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115576040","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Semi-supervised Domain Adaptation with Subspace Learning for visual recognition","authors":"Ting Yao, Yingwei Pan, C. Ngo, Houqiang Li, Tao Mei","doi":"10.1109/CVPR.2015.7298826","DOIUrl":"https://doi.org/10.1109/CVPR.2015.7298826","url":null,"abstract":"In many real-world applications, we often face the problem of cross-domain learning, i.e., borrowing labeled data or transferring already-learned knowledge from a source domain to a target domain. However, simply applying existing source data or knowledge may even hurt performance, especially when the data distributions in the source and target domains are quite different, or when very few labeled examples are available in the target domain. This paper proposes a novel domain adaptation framework, named Semi-supervised Domain Adaptation with Subspace Learning (SDASL), which jointly explores invariant low-dimensional structures across domains to correct data distribution mismatch and leverages available unlabeled target examples to exploit the underlying intrinsic information in the target domain. Specifically, SDASL conducts the learning by simultaneously minimizing the classification error, preserving the structure within and across domains, and restricting similarity defined on unlabeled target examples. Encouraging results are reported for two challenging domain transfer tasks (including image-to-image and image-to-video transfers) on several standard datasets in the context of both image object recognition and video concept detection.","PeriodicalId":444472,"journal":{"name":"2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116026900","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Object detection by labeling superpixels","authors":"Junjie Yan, Yinan Yu, Xiangyu Zhu, Zhen Lei, S. Li","doi":"10.1109/CVPR.2015.7299146","DOIUrl":"https://doi.org/10.1109/CVPR.2015.7299146","url":null,"abstract":"Object detection is often conducted by object proposal generation and classification sequentially. This paper handles object detection in a superpixel-oriented manner instead of a proposal-oriented one. Specifically, this paper casts object detection as a multi-label superpixel labeling problem solved by minimizing an energy function. It uses a data cost term to capture appearance, a smoothness cost term to encode spatial context, and a label cost term to favor compact detections. The data cost is learned through a convolutional neural network, and the parameters of the labeling model are learned through a structural SVM. Compared with methods based on proposal generation and classification, the proposed superpixel labeling method can naturally detect objects missed by the proposal generation step and capture global image context to infer overlapping objects. The proposed method shows its advantage on Pascal VOC and ImageNet. Notably, it performs better than the ImageNet ILSVRC2014 winner GoogLeNet (45.0% vs. 43.9% mAP) with much shallower and fewer CNNs.","PeriodicalId":444472,"journal":{"name":"2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116398170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Intra-frame deblurring by leveraging inter-frame camera motion","authors":"Haichao Zhang, Jianchao Yang","doi":"10.1109/CVPR.2015.7299030","DOIUrl":"https://doi.org/10.1109/CVPR.2015.7299030","url":null,"abstract":"Camera motion introduces motion blur, degrading the quality of video. A video deblurring method is proposed based on two observations: (i) camera motion within capture of each individual frame leads to motion blur; (ii) camera motion between frames yields inter-frame mis-alignment that can be exploited for blur removal. The proposed method effectively leverages the information distributed across multiple video frames due to camera motion, jointly estimating the motion between consecutive frames and blur within each frame. This joint analysis is crucial for achieving effective restoration by leveraging temporal information. Extensive experiments are carried out on synthetic data as well as real-world blurry videos. Comparisons with several state-of-the-art methods verify the effectiveness of the proposed method.","PeriodicalId":444472,"journal":{"name":"2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116536586","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"R6P - Rolling shutter absolute pose problem","authors":"Cenek Albl, Z. Kukelova, T. Pajdla","doi":"10.1109/CVPR.2015.7298842","DOIUrl":"https://doi.org/10.1109/CVPR.2015.7298842","url":null,"abstract":"We present a minimal, non-iterative solution to the absolute pose problem for images from rolling shutter cameras. The absolute pose problem is a key problem in computer vision, and rolling shutter is present in the vast majority of today's digital cameras. We propose several rolling shutter camera models and verify their feasibility for a polynomial solver. A solution based on a linearized camera model is chosen and verified in several experiments. We use a linear approximation to the camera orientation, which is meaningful only around the identity rotation. We show that the standard P3P algorithm is able to estimate camera orientation within 6 degrees for camera rotation velocities as high as 30 deg/frame. Therefore, we can use the standard P3P algorithm to estimate camera orientation and bring the camera rotation matrix close to the identity. Using this solution, camera position, orientation, translational velocity, and angular velocity can be computed from six 2D-to-3D correspondences, with orientation error under half a degree and relative position error under 2%. A significant improvement in the number of RANSAC inliers is demonstrated.","PeriodicalId":444472,"journal":{"name":"2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122314984","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Making better use of edges via perceptual grouping","authors":"Yonggang Qi, Yi-Zhe Song, T. Xiang, Honggang Zhang, Timothy M. Hospedales, Yi Li, Jun Guo","doi":"10.1109/CVPR.2015.7298795","DOIUrl":"https://doi.org/10.1109/CVPR.2015.7298795","url":null,"abstract":"We propose a perceptual grouping framework that organizes image edges into meaningful structures and demonstrate its usefulness on various computer vision tasks. Our grouper formulates edge grouping as a graph partition problem, where a learning-to-rank method is developed to encode probabilities of candidate edge pairs. In particular, RankSVM is employed for the first time to combine multiple Gestalt principles as cues for edge grouping. Afterwards, an edge grouping based object proposal measure is introduced that yields proposals comparable to state-of-the-art alternatives. We further show how human-like sketches can be generated from edge groupings and consequently used to deliver state-of-the-art sketch-based image retrieval performance. Last but not least, we tackle the problem of freehand human sketch segmentation by utilizing the proposed grouper to cluster strokes into semantic object parts.","PeriodicalId":444472,"journal":{"name":"2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"91 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122410195","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The common self-polar triangle of concentric circles and its application to camera calibration","authors":"H. Huang, Hui Zhang, Yiu-ming Cheung","doi":"10.1109/CVPR.2015.7299033","DOIUrl":"https://doi.org/10.1109/CVPR.2015.7299033","url":null,"abstract":"In projective geometry, the common self-polar triangle has often been used to discuss the positional relationship of two planar conics. However, little research has examined the properties of the common self-polar triangle itself, especially when the two planar conics are special conics. In this paper, we explore the properties of the common self-polar triangle when the two conics happen to be concentric circles. We show that there exist infinitely many common self-polar triangles of two concentric circles, and provide a method to locate the vertices of these triangles. By investigating all these triangles, we find that they encode two important properties. The first is that all triangles share one common vertex, and the sides opposite this common vertex lie on the same line; these are the circle center and the line at infinity of the support plane, respectively. The second is that all triangles are right triangles. Based on these two properties, the imaged circle center and the vanishing line of the support plane can be recovered simultaneously, and many conjugate pairs on the vanishing line can be obtained. These allow us to induce good constraints on the image of the absolute conic. We evaluate two calibration algorithms, whereby accurate results are achieved. The main contribution of this paper is a new perspective on the circle-based camera calibration problem. We believe that other calibration methods using different circle patterns can benefit from this perspective, especially for patterns that involve more than two circles.","PeriodicalId":444472,"journal":{"name":"2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"320 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122786371","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"First-person pose recognition using egocentric workspaces","authors":"Grégory Rogez, J. Supančič, Deva Ramanan","doi":"10.1109/CVPR.2015.7299061","DOIUrl":"https://doi.org/10.1109/CVPR.2015.7299061","url":null,"abstract":"We tackle the problem of estimating the 3D pose of an individual's upper limbs (arms+hands) from a chest-mounted depth camera. Importantly, we consider pose estimation during everyday interactions with objects. Past work shows that strong pose+viewpoint priors and depth-based features are crucial for robust performance. In egocentric views, hands and arms are observable within a well-defined volume in front of the camera. We call this volume an egocentric workspace. A notable property is that hand appearance correlates with workspace location. To exploit this correlation, we classify arm+hand configurations in a global egocentric coordinate frame, rather than a local scanning window. This greatly simplifies the architecture and improves performance. We propose an efficient pipeline which 1) generates synthetic workspace exemplars for training using a virtual chest-mounted camera whose intrinsic parameters match our physical camera, 2) computes perspective-aware depth features on this entire volume and 3) recognizes discrete arm+hand pose classes through a sparse multi-class SVM. We achieve state-of-the-art hand pose recognition performance from egocentric RGB-D images in real time.","PeriodicalId":444472,"journal":{"name":"2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122932301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reflection removal for in-vehicle black box videos","authors":"C. Simon, I. Park","doi":"10.1109/CVPR.2015.7299051","DOIUrl":"https://doi.org/10.1109/CVPR.2015.7299051","url":null,"abstract":"The in-vehicle black box camera (dashboard camera) has become a popular device in many countries for security monitoring and event capturing. The readability of the video content is the most critical matter; however, the content is often degraded by windscreen reflections of objects inside the vehicle. In this paper, we propose a novel method to remove the windscreen reflection from in-vehicle black box videos. The method exploits the spatio-temporal coherence of reflection: the vehicle moves forward while the reflection of the interior objects remains static. An average-image prior is proposed, imposing a heavy-tailed distribution with a higher peak, to remove the reflection. A two-layered scene model, composed of reflection and background layers, is the basis of the separation. A non-convex cost function is developed based on this property and optimized efficiently in a half-quadratic form. Experimental results demonstrate that the proposed approach successfully separates the reflection layer in several real black box videos.","PeriodicalId":444472,"journal":{"name":"2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"272 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122837479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real-time coarse-to-fine topologically preserving segmentation","authors":"Jian Yao, Marko Boben, S. Fidler, R. Urtasun","doi":"10.1109/CVPR.2015.7298913","DOIUrl":"https://doi.org/10.1109/CVPR.2015.7298913","url":null,"abstract":"In this paper, we tackle the problem of unsupervised segmentation in the form of superpixels. Our main emphasis is on speed and accuracy. We build on [31] to define the problem as a boundary- and topology-preserving Markov random field. We propose a coarse-to-fine optimization technique that speeds up inference, in terms of the number of updates, by an order of magnitude. Our approach is shown to outperform [31] while employing a single iteration. We evaluate and compare our approach to state-of-the-art superpixel algorithms on the BSD and KITTI benchmarks. Our approach significantly outperforms the baselines in the segmentation metrics and achieves the lowest error on the stereo task.","PeriodicalId":444472,"journal":{"name":"2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"165 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122999427","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}