{"title":"SUN3D: A Database of Big Spaces Reconstructed Using SfM and Object Labels","authors":"Jianxiong Xiao, Andrew Owens, A. Torralba","doi":"10.1109/ICCV.2013.458","DOIUrl":"https://doi.org/10.1109/ICCV.2013.458","url":null,"abstract":"Existing scene understanding datasets contain only a limited set of views of a place, and they lack representations of complete 3D spaces. In this paper, we introduce SUN3D, a large-scale RGB-D video database with camera pose and object labels, capturing the full 3D extent of many places. The tasks that go into constructing such a dataset are difficult in isolation -- hand-labeling videos is painstaking, and structure from motion (SfM) is unreliable for large spaces. But if we combine them together, we make the dataset construction task much easier. First, we introduce an intuitive labeling tool that uses a partial reconstruction to propagate labels from one frame to another. Then we use the object labels to fix errors in the reconstruction. For this, we introduce a generalization of bundle adjustment that incorporates object-to-object correspondences. This algorithm works by constraining points for the same object from different frames to lie inside a fixed-size bounding box, parameterized by its rotation and translation. The SUN3D database, the source code for the generalized bundle adjustment, and the web-based 3D annotation tool are all available at http://sun3d.cs.princeton.edu.","PeriodicalId":6351,"journal":{"name":"2013 IEEE International Conference on Computer Vision","volume":"162 1","pages":"1625-1632"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73804295","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detecting Curved Symmetric Parts Using a Deformable Disc Model","authors":"T. S. Lee, S. Fidler, Sven J. Dickinson","doi":"10.1109/ICCV.2013.220","DOIUrl":"https://doi.org/10.1109/ICCV.2013.220","url":null,"abstract":"Symmetry is a powerful shape regularity that's been exploited by perceptual grouping researchers in both human and computer vision to recover part structure from an image without a priori knowledge of scene content. Drawing on the concept of a medial axis, defined as the locus of centers of maximal inscribed discs that sweep out a symmetric part, we model part recovery as the search for a sequence of deformable maximal inscribed disc hypotheses generated from a multiscale super pixel segmentation, a framework proposed by LEV09. However, we learn affinities between adjacent super pixels in a space that's invariant to bending and tapering along the symmetry axis, enabling us to capture a wider class of symmetric parts. Moreover, we introduce a global cost that perceptually integrates the hypothesis space by combining a pair wise and a higher-level smoothing term, which we minimize globally using dynamic programming. The new framework is demonstrated on two datasets, and is shown to significantly outperform the baseline LEV09.","PeriodicalId":6351,"journal":{"name":"2013 IEEE International Conference on Computer Vision","volume":"98 1","pages":"1753-1760"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74667975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Query-Adaptive Asymmetrical Dissimilarities for Visual Object Retrieval","authors":"Cai-Zhi Zhu, H. Jégou, S. Satoh","doi":"10.1109/ICCV.2013.214","DOIUrl":"https://doi.org/10.1109/ICCV.2013.214","url":null,"abstract":"Visual object retrieval aims at retrieving, from a collection of images, all those in which a given query object appears. It is inherently asymmetric: the query object is mostly included in the database image, while the converse is not necessarily true. However, existing approaches mostly compare the images with symmetrical measures, without considering the different roles of query and database. This paper first measure the extent of asymmetry on large-scale public datasets reflecting this task. Considering the standard bag-of-words representation, we then propose new asymmetrical dissimilarities accounting for the different inlier ratios associated with query and database images. These asymmetrical measures depend on the query, yet they are compatible with an inverted file structure, without noticeably impacting search efficiency. Our experiments show the benefit of our approach, and show that the visual object retrieval task is better treated asymmetrically, in the spirit of state-of-the-art text retrieval.","PeriodicalId":6351,"journal":{"name":"2013 IEEE International Conference on Computer Vision","volume":"18 1","pages":"1705-1712"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75739663","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A General Dense Image Matching Framework Combining Direct and Feature-Based Costs","authors":"Jim Braux-Zin, R. Dupont, A. Bartoli","doi":"10.1109/ICCV.2013.30","DOIUrl":"https://doi.org/10.1109/ICCV.2013.30","url":null,"abstract":"Dense motion field estimation (typically optical flow, stereo disparity and surface registration) is a key computer vision problem. Many solutions have been proposed to compute small or large displacements, narrow or wide baseline stereo disparity, but a unified methodology is still lacking. We here introduce a general framework that robustly combines direct and feature-based matching. The feature-based cost is built around a novel robust distance function that handles key points and ``weak'' features such as segments. It allows us to use putative feature matches which may contain mismatches to guide dense motion estimation out of local minima. Our framework uses a robust direct data term (AD-Census). It is implemented with a powerful second order Total Generalized Variation regularization with external and self-occlusion reasoning. Our framework achieves state of the art performance in several cases (standard optical flow benchmarks, wide-baseline stereo and non-rigid surface registration). Our framework has a modular design that customizes to specific application needs.","PeriodicalId":6351,"journal":{"name":"2013 IEEE International Conference on Computer Vision","volume":"61 1","pages":"185-192"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72585974","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Nested Shape Descriptors","authors":"J. Byrne, Jianbo Shi","doi":"10.1109/ICCV.2013.152","DOIUrl":"https://doi.org/10.1109/ICCV.2013.152","url":null,"abstract":"In this paper, we propose a new family of binary local feature descriptors called nested shape descriptors. These descriptors are constructed by pooling oriented gradients over a large geometric structure called the Hawaiian earring, which is constructed with a nested correlation structure that enables a new robust local distance function called the nesting distance. This distance function is unique to the nested descriptor and provides robustness to outliers from order statistics. In this paper, we define the nested shape descriptor family and introduce a specific member called the seed-of-life descriptor. We perform a trade study to determine optimal descriptor parameters for the task of image matching. Finally, we evaluate performance compared to state-of-the-art local feature descriptors on the VGG-Affine image matching benchmark, showing significant performance gains. Our descriptor is the first binary descriptor to outperform SIFT on this benchmark.","PeriodicalId":6351,"journal":{"name":"2013 IEEE International Conference on Computer Vision","volume":"2 1","pages":"1201-1208"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75018597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tae-Hyun Oh, Hyeongwoo Kim, Yu-Wing Tai, J. Bazin, In-So Kweon
{"title":"Partial Sum Minimization of Singular Values in RPCA for Low-Level Vision","authors":"Tae-Hyun Oh, Hyeongwoo Kim, Yu-Wing Tai, J. Bazin, In-So Kweon","doi":"10.1109/ICCV.2013.25","DOIUrl":"https://doi.org/10.1109/ICCV.2013.25","url":null,"abstract":"Robust Principal Component Analysis (RPCA) via rank minimization is a powerful tool for recovering underlying low-rank structure of clean data corrupted with sparse noise/outliers. In many low-level vision problems, not only it is known that the underlying structure of clean data is low-rank, but the exact rank of clean data is also known. Yet, when applying conventional rank minimization for those problems, the objective function is formulated in a way that does not fully utilize a priori target rank information about the problems. This observation motivates us to investigate whether there is a better alternative solution when using rank minimization. In this paper, instead of minimizing the nuclear norm, we propose to minimize the partial sum of singular values. The proposed objective function implicitly encourages the target rank constraint in rank minimization. Our experimental analyses show that our approach performs better than conventional rank minimization when the number of samples is deficient, while the solutions obtained by the two approaches are almost identical when the number of samples is more than sufficient. We apply our approach to various low-level vision problems, e.g. high dynamic range imaging, photometric stereo and image alignment, and show that our results outperform those obtained by the conventional nuclear norm rank minimization method.","PeriodicalId":6351,"journal":{"name":"2013 IEEE International Conference on Computer Vision","volume":"35 1","pages":"145-152"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75121871","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abdelaziz Djelouah, Jean-Sébastien Franco, Edmond Boyer, F. Clerc, P. Pérez
{"title":"Multi-view Object Segmentation in Space and Time","authors":"Abdelaziz Djelouah, Jean-Sébastien Franco, Edmond Boyer, F. Clerc, P. Pérez","doi":"10.1109/ICCV.2013.328","DOIUrl":"https://doi.org/10.1109/ICCV.2013.328","url":null,"abstract":"In this paper, we address the problem of object segmentation in multiple views or videos when two or more viewpoints of the same scene are available. We propose a new approach that propagates segmentation coherence information in both space and time, hence allowing evidences in one image to be shared over the complete set. To this aim the segmentation is cast as a single efficient labeling problem over space and time with graph cuts. In contrast to most existing multi-view segmentation methods that rely on some form of dense reconstruction, ours only requires a sparse 3D sampling to propagate information between viewpoints. The approach is thoroughly evaluated on standard multi-view datasets, as well as on videos. With static views, results compete with state of the art methods but they are achieved with significantly fewer viewpoints. With multiple videos, we report results that demonstrate the benefit of segmentation propagation through temporal cues.","PeriodicalId":6351,"journal":{"name":"2013 IEEE International Conference on Computer Vision","volume":"33 1","pages":"2640-2647"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76587043","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PM-Huber: PatchMatch with Huber Regularization for Stereo Matching","authors":"Philipp Heise, S. Klose, B. Jensen, Alois Knoll","doi":"10.1109/ICCV.2013.293","DOIUrl":"https://doi.org/10.1109/ICCV.2013.293","url":null,"abstract":"Most stereo correspondence algorithms match support windows at integer-valued disparities and assume a constant disparity value within the support window. The recently proposed Patch Match stereo algorithm by Bleyer et al. overcomes this limitation of previous algorithms by directly estimating planes. This work presents a method that integrates the Patch Match stereo algorithm into a variational smoothing formulation using quadratic relaxation. The resulting algorithm allows the explicit regularization of the disparity and normal gradients using the estimated plane parameters. Evaluation of our method in the Middlebury benchmark shows that our method outperforms the traditional integer-valued disparity strategy as well as the original algorithm and its variants in sub-pixel accurate disparity estimation.","PeriodicalId":6351,"journal":{"name":"2013 IEEE International Conference on Computer Vision","volume":"45 1","pages":"2360-2367"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77540809","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic Probabilistic Volumetric Models","authors":"Ali O. Ulusoy, O. Biris, J. Mundy","doi":"10.1109/ICCV.2013.68","DOIUrl":"https://doi.org/10.1109/ICCV.2013.68","url":null,"abstract":"This paper presents a probabilistic volumetric framework for image based modeling of general dynamic 3-d scenes. The framework is targeted towards high quality modeling of complex scenes evolving over thousands of frames. Extensive storage and computational resources are required in processing large scale space-time (4-d) data. Existing methods typically store separate 3-d models at each time step and do not address such limitations. A novel 4-d representation is proposed that adaptively subdivides in space and time to explain the appearance of 3-d dynamic surfaces. This representation is shown to achieve compression of 4-d data and provide efficient spatio-temporal processing. The advances of the proposed framework is demonstrated on standard datasets using free-viewpoint video and 3-d tracking applications.","PeriodicalId":6351,"journal":{"name":"2013 IEEE International Conference on Computer Vision","volume":"35 1","pages":"505-512"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81798680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Generalized Low-Rank Appearance Model for Spatio-temporally Correlated Rain Streaks","authors":"Yi-Lei Chen, Chiou-Ting Hsu","doi":"10.1109/ICCV.2013.247","DOIUrl":"https://doi.org/10.1109/ICCV.2013.247","url":null,"abstract":"In this paper, we propose a novel low-rank appearance model for removing rain streaks. Different from previous work, our method needs neither rain pixel detection nor time-consuming dictionary learning stage. Instead, as rain streaks usually reveal similar and repeated patterns on imaging scene, we propose and generalize a low-rank model from matrix to tensor structure in order to capture the spatio-temporally correlated rain streaks. With the appearance model, we thus remove rain streaks from image/video (and also other high-order image structure) in a unified way. Our experimental results demonstrate competitive (or even better) visual quality and efficient run-time in comparison with state of the art.","PeriodicalId":6351,"journal":{"name":"2013 IEEE International Conference on Computer Vision","volume":"15 1","pages":"1968-1975"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81825146","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}