{"title":"RGB-Guided Hyperspectral Image Upsampling","authors":"HyeokHyen Kwon, Yu-Wing Tai","doi":"10.1109/ICCV.2015.43","DOIUrl":"https://doi.org/10.1109/ICCV.2015.43","url":null,"abstract":"Hyperspectral imaging usually lack of spatial resolution due to limitations of hardware design of imaging sensors. On the contrary, latest imaging sensors capture a RGB image with resolution of multiple times larger than a hyperspectral image. In this paper, we present an algorithm to enhance and upsample the resolution of hyperspectral images. Our algorithm consists of two stages: spatial upsampling stage and spectrum substitution stage. The spatial upsampling stage is guided by a high resolution RGB image of the same scene, and the spectrum substitution stage utilizes sparse coding to locally refine the upsampled hyperspectral image through dictionary substitution. Experiments show that our algorithm is highly effective and has outperformed state-of-the-art matrix factorization based approaches.","PeriodicalId":6633,"journal":{"name":"2015 IEEE International Conference on Computer Vision (ICCV)","volume":"1 1","pages":"307-315"},"PeriodicalIF":0.0,"publicationDate":"2015-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87079185","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
S. An, Munawar Hayat, S. H. Khan, Bennamoun, F. Boussaïd, Ferdous Sohel
{"title":"Contractive Rectifier Networks for Nonlinear Maximum Margin Classification","authors":"S. An, Munawar Hayat, S. H. Khan, Bennamoun, F. Boussaïd, Ferdous Sohel","doi":"10.1109/ICCV.2015.289","DOIUrl":"https://doi.org/10.1109/ICCV.2015.289","url":null,"abstract":"To find the optimal nonlinear separating boundary with maximum margin in the input data space, this paper proposes Contractive Rectifier Networks (CRNs), wherein the hidden-layer transformations are restricted to be contraction mappings. The contractive constraints ensure that the achieved separating margin in the input space is larger than or equal to the separating margin in the output layer. The training of the proposed CRNs is formulated as a linear support vector machine (SVM) in the output layer, combined with two or more contractive hidden layers. Effective algorithms have been proposed to address the optimization challenges arising from contraction constraints. Experimental results on MNIST, CIFAR-10, CIFAR-100 and MIT-67 datasets demonstrate that the proposed contractive rectifier networks consistently outperform their conventional unconstrained rectifier network counterparts.","PeriodicalId":6633,"journal":{"name":"2015 IEEE International Conference on Computer Vision (ICCV)","volume":"2011 1","pages":"2515-2523"},"PeriodicalIF":0.0,"publicationDate":"2015-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86346738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards Pointless Structure from Motion: 3D Reconstruction and Camera Parameters from General 3D Curves","authors":"Irina Nurutdinova, A. Fitzgibbon","doi":"10.1109/ICCV.2015.272","DOIUrl":"https://doi.org/10.1109/ICCV.2015.272","url":null,"abstract":"Modern structure from motion (SfM) remains dependent on point features to recover camera positions, meaning that reconstruction is severely hampered in low-texture environments, for example scanning a plain coffee cup on an uncluttered table. We show how 3D curves can be used to refine camera position estimation in challenging low-texture scenes. In contrast to previous work, we allow the curves to be partially observed in all images, meaning that for the first time, curve-based SfM can be demonstrated in realistic scenes. The algorithm is based on bundle adjustment, so needs an initial estimate, but even a poor estimate from a few point correspondences can be substantially improved by including curves, suggesting that this method would benefit many existing systems.","PeriodicalId":6633,"journal":{"name":"2015 IEEE International Conference on Computer Vision (ICCV)","volume":"35 1","pages":"2363-2371"},"PeriodicalIF":0.0,"publicationDate":"2015-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87425208","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multiple Hypothesis Tracking Revisited","authors":"Chanho Kim, Fuxin Li, A. Ciptadi, James M. Rehg","doi":"10.1109/ICCV.2015.533","DOIUrl":"https://doi.org/10.1109/ICCV.2015.533","url":null,"abstract":"This paper revisits the classical multiple hypotheses tracking (MHT) algorithm in a tracking-by-detection framework. The success of MHT largely depends on the ability to maintain a small list of potential hypotheses, which can be facilitated with the accurate object detectors that are currently available. We demonstrate that a classical MHT implementation from the 90's can come surprisingly close to the performance of state-of-the-art methods on standard benchmark datasets. In order to further utilize the strength of MHT in exploiting higher-order information, we introduce a method for training online appearance models for each track hypothesis. We show that appearance models can be learned efficiently via a regularized least squares framework, requiring only a few extra operations for each hypothesis branch. We obtain state-of-the-art results on popular tracking-by-detection datasets such as PETS and the recent MOT challenge.","PeriodicalId":6633,"journal":{"name":"2015 IEEE International Conference on Computer Vision (ICCV)","volume":"104 1","pages":"4696-4704"},"PeriodicalIF":0.0,"publicationDate":"2015-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80830221","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Airborne Three-Dimensional Cloud Tomography","authors":"Aviad Levis, Y. Schechner, Amit Aides, A. Davis","doi":"10.1109/ICCV.2015.386","DOIUrl":"https://doi.org/10.1109/ICCV.2015.386","url":null,"abstract":"We seek to sense the three dimensional (3D) volumetric distribution of scatterers in a heterogenous medium. An important case study for such a medium is the atmosphere. Atmospheric contents and their role in Earth's radiation balance have significant uncertainties with regards to scattering components: aerosols and clouds. Clouds, made of water droplets, also lead to local effects as precipitation and shadows. Our sensing approach is computational tomography using passive multi-angular imagery. For light-matter interaction that accounts for multiple-scattering, we use the 3D radiative transfer equation as a forward model. Volumetric recovery by inverting this model suffers from a computational bottleneck on large scales, which include many unknowns. Steps taken make this tomography tractable, without approximating the scattering order or angle range.","PeriodicalId":6633,"journal":{"name":"2015 IEEE International Conference on Computer Vision (ICCV)","volume":"22 1","pages":"3379-3387"},"PeriodicalIF":0.0,"publicationDate":"2015-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81162086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The HCI Stereo Metrics: Geometry-Aware Performance Analysis of Stereo Algorithms","authors":"Katrin Honauer, L. Maier-Hein, D. Kondermann","doi":"10.1109/ICCV.2015.245","DOIUrl":"https://doi.org/10.1109/ICCV.2015.245","url":null,"abstract":"Performance characterization of stereo methods is mandatory to decide which algorithm is useful for which application. Prevalent benchmarks mainly use the root mean squared error (RMS) with respect to ground truth disparity maps to quantify algorithm performance. We show that the RMS is of limited expressiveness for algorithm selection and introduce the HCI Stereo Metrics. These metrics assess stereo results by harnessing three semantic cues: depth discontinuities, planar surfaces, and fine geometric structures. For each cue, we extract the relevant set of pixels from existing ground truth. We then apply our evaluation functions to quantify characteristics such as edge fattening and surface smoothness. We demonstrate that our approach supports practitioners in selecting the most suitable algorithm for their application. Using the new Middlebury dataset, we show that rankings based on our metrics reveal specific algorithm strengths and weaknesses which are not quantified by existing metrics. We finally show how stacked bar charts and radar charts visually support multidimensional performance evaluation. An interactive stereo benchmark based on the proposed metrics and visualizations is available at: http://hci.iwr.uni-heidelberg.de/stereometrics.","PeriodicalId":6633,"journal":{"name":"2015 IEEE International Conference on Computer Vision (ICCV)","volume":"30 1","pages":"2120-2128"},"PeriodicalIF":0.0,"publicationDate":"2015-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81434092","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yanhua Cheng, Rui Cai, Chi Zhang, Zhiwei Li, Xin Zhao, Kaiqi Huang, Y. Rui
{"title":"Query Adaptive Similarity Measure for RGB-D Object Recognition","authors":"Yanhua Cheng, Rui Cai, Chi Zhang, Zhiwei Li, Xin Zhao, Kaiqi Huang, Y. Rui","doi":"10.1109/ICCV.2015.25","DOIUrl":"https://doi.org/10.1109/ICCV.2015.25","url":null,"abstract":"This paper studies the problem of improving the top-1 accuracy of RGB-D object recognition. Despite of the impressive top-5 accuracies achieved by existing methods, their top-1 accuracies are not very satisfactory. The reasons are in two-fold: (1) existing similarity measures are sensitive to object pose and scale changes, as well as intra-class variations, and (2) effectively fusing RGB and depth cues is still an open problem. To address these problems, this paper first proposes a new similarity measure based on dense matching, through which objects in comparison are warped and aligned, to better tolerate variations. Towards RGB and depth fusion, we argue that a constant and golden weight doesn't exist. The two modalities have varying contributions when comparing objects from different categories. To capture such a dynamic characteristic, a group of matchers equipped with various fusion weights is constructed, to explore the responses of dense matching under different fusion configurations. All the response scores are finally merged following a learning-to-combination way, which provides quite good generalization ability in practice. The proposed approach win the best results on several public benchmarks, e.g., achieves 92.7% top-1 test accuracy on the Washington RGB-D object dataset, with a 5.1% improvement over the state-of-the-art.","PeriodicalId":6633,"journal":{"name":"2015 IEEE International Conference on Computer Vision (ICCV)","volume":"12 1","pages":"145-153"},"PeriodicalIF":0.0,"publicationDate":"2015-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83881035","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Motion Trajectory Segmentation via Minimum Cost Multicuts","authors":"M. Keuper, Bjoern Andres, T. Brox","doi":"10.1109/ICCV.2015.374","DOIUrl":"https://doi.org/10.1109/ICCV.2015.374","url":null,"abstract":"For the segmentation of moving objects in videos, the analysis of long-term point trajectories has been very popular recently. In this paper, we formulate the segmentation of a video sequence based on point trajectories as a minimum cost multicut problem. Unlike the commonly used spectral clustering formulation, the minimum cost multicut formulation gives natural rise to optimize not only for a cluster assignment but also for the number of clusters while allowing for varying cluster sizes. In this setup, we provide a method to create a long-term point trajectory graph with attractive and repulsive binary terms and outperform state-of-the-art methods based on spectral clustering on the FBMS-59 dataset and on the motion subtask of the VSB100 dataset.","PeriodicalId":6633,"journal":{"name":"2015 IEEE International Conference on Computer Vision (ICCV)","volume":"16 1","pages":"3271-3279"},"PeriodicalIF":0.0,"publicationDate":"2015-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82680178","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xiang Fu, Chien-Yi Wang, Chen Chen, Changhu Wang, C.-C. Jay Kuo
{"title":"Robust Image Segmentation Using Contour-Guided Color Palettes","authors":"Xiang Fu, Chien-Yi Wang, Chen Chen, Changhu Wang, C.-C. Jay Kuo","doi":"10.1109/ICCV.2015.189","DOIUrl":"https://doi.org/10.1109/ICCV.2015.189","url":null,"abstract":"The contour-guided color palette (CCP) is proposed for robust image segmentation. It efficiently integrates contour and color cues of an image. To find representative colors of an image, color samples along long contours between regions, similar in spirit to machine learning methodology that focus on samples near decision boundaries, are collected followed by the mean-shift (MS) algorithm in the sampled color space to achieve an image-dependent color palette. This color palette provides a preliminary segmentation in the spatial domain, which is further fine-tuned by post-processing techniques such as leakage avoidance, fake boundary removal, and small region mergence. Segmentation performances of CCP and MS are compared and analyzed. While CCP offers an acceptable standalone segmentation result, it can be further integrated into the framework of layered spectral segmentation to produce a more robust segmentation. The superior performance of CCP-based segmentation algorithm is demonstrated by experiments on the Berkeley Segmentation Dataset.","PeriodicalId":6633,"journal":{"name":"2015 IEEE International Conference on Computer Vision (ICCV)","volume":"1 1","pages":"1618-1625"},"PeriodicalIF":0.0,"publicationDate":"2015-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90045167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Danhang Tang, Jonathan Taylor, Pushmeet Kohli, Cem Keskin, Tae-Kyun Kim, J. Shotton
{"title":"Opening the Black Box: Hierarchical Sampling Optimization for Estimating Human Hand Pose","authors":"Danhang Tang, Jonathan Taylor, Pushmeet Kohli, Cem Keskin, Tae-Kyun Kim, J. Shotton","doi":"10.1109/ICCV.2015.380","DOIUrl":"https://doi.org/10.1109/ICCV.2015.380","url":null,"abstract":"We address the problem of hand pose estimation, formulated as an inverse problem. Typical approaches optimize an energy function over pose parameters using a 'black box' image generation procedure. This procedure knows little about either the relationships between the parameters or the form of the energy function. In this paper, we show that we can significantly improving upon black box optimization by exploiting high-level knowledge of the structure of the parameters and using a local surrogate energy function. Our new framework, hierarchical sampling optimization, consists of a sequence of predictors organized into a kinematic hierarchy. Each predictor is conditioned on its ancestors, and generates a set of samples over a subset of the pose parameters. The highly-efficient surrogate energy is used to select among samples. Having evaluated the full hierarchy, the partial pose samples are concatenated to generate a full-pose hypothesis. Several hypotheses are generated using the same procedure, and finally the original full energy function selects the best result. Experimental evaluation on three publically available datasets show that our method is particularly impressive in low-compute scenarios where it significantly outperforms all other state-of-the-art methods.","PeriodicalId":6633,"journal":{"name":"2015 IEEE International Conference on Computer Vision (ICCV)","volume":"11 1","pages":"3325-3333"},"PeriodicalIF":0.0,"publicationDate":"2015-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90270774","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}