Synthesizing Training Images for Boosting Human 3D Pose Estimation
Wenzheng Chen, Huan Wang, Yangyan Li, Hao Su, Zhenhua Wang, Changhe Tu, D. Lischinski, D. Cohen-Or, Baoquan Chen
2016 Fourth International Conference on 3D Vision (3DV). DOI: https://doi.org/10.1109/3DV.2016.58
Abstract: Human 3D pose estimation from a single image is a challenging task with numerous applications. Convolutional Neural Networks (CNNs) have recently achieved superior performance on the task of 2D pose estimation from a single image, by training on images with 2D annotations collected by crowdsourcing. This suggests that similar success could be achieved for direct estimation of 3D poses. However, 3D poses are much harder to annotate, and the lack of suitable annotated training images hinders attempts towards end-to-end solutions. To address this issue, we opt to automatically synthesize training images with ground truth pose annotations. Our work is a systematic study along this road. We find that pose space coverage and texture diversity are the key ingredients for the effectiveness of synthetic training data. We present a fully automatic, scalable approach that samples the human pose space for guiding the synthesis procedure and extracts clothing textures from real images. Furthermore, we explore domain adaptation for bridging the gap between our synthetic training images and real testing photos. We demonstrate that CNNs trained with our synthetic images outperform those trained with real photos on 3D pose estimation tasks.
{"title":"Radiometric Scene Decomposition: Scene Reflectance, Illumination, and Geometry from RGB-D Images","authors":"Stephen Lombardi, K. Nishino","doi":"10.1109/3DV.2016.39","DOIUrl":"https://doi.org/10.1109/3DV.2016.39","url":null,"abstract":"Recovering the radiometric properties of a scene (i.e., the reflectance, illumination, and geometry) is a long-sought ability of computer vision that can provide invaluable information for a wide range of applications. Deciphering the radiometric ingredients from the appearance of a real-world scene, as opposed to a single isolated object, is particularly challenging as it generally consists of various objects with different material compositions exhibiting complex reflectance and light interactions that are also part of the illumination. We introduce the first method for radiometric decomposition of real-world scenes that handles those intricacies. We use RGB-D images to bootstrap geometry recovery and simultaneously recover the complex reflectance and natural illumination while refining the noisy initial geometry and segmenting the scene into different material regions. Most important, we handle real-world scenes consisting of multiple objects of unknown materials, which necessitates the modeling of spatially-varying complex reflectance, natural illumination, texture, interreflection and shadows. We systematically evaluate the effectiveness of our method on synthetic scenes and demonstrate its application to real-world scenes. The results show that rich radiometric information can be recovered from RGB-D images and demonstrate a new role RGB-D sensors can play for general scene understanding tasks.","PeriodicalId":425304,"journal":{"name":"2016 Fourth International Conference on 3D Vision (3DV)","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124973452","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HDRFusion: HDR SLAM Using a Low-Cost Auto-Exposure RGB-D Sensor","authors":"Shuda Li, Ankur Handa, Yang Zhang, A. Calway","doi":"10.1109/3DV.2016.40","DOIUrl":"https://doi.org/10.1109/3DV.2016.40","url":null,"abstract":"Most dense RGB/RGB-D SLAM systems require the brightness of 3-D points observed from different viewpoints to be constant. However, in reality, this assumption is difficult to meet even when the surface is Lambertian and illumination is static. One cause is that most cameras automatically tune exposure to adapt to the wide dynamic range of scene radiance, violating the brightness assumption. We describe a novel system - HDRFusion - which turns this apparent drawback into an advantage by fusing LDR frames into an HDR textured volume using a standard RGB-D sensor with auto-exposure (AE) enabled. The key contribution is the use of a normalised metric for frame alignment which is invariant to changes in exposure time. This enables robust tracking in frame-to-model mode and also compensates the exposure accurately so that HDR texture, free of artefacts, can be generated online. We demonstrate that the tracking robustness and accuracy is greatly improved by the approach and that radiance maps can be generated with far greater dynamic range of scene radiance.","PeriodicalId":425304,"journal":{"name":"2016 Fourth International Conference on 3D Vision (3DV)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114107019","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Learning to Navigate the Energy Landscape
Julien P. C. Valentin, Angela Dai, M. Nießner, Pushmeet Kohli, Philip H. S. Torr, S. Izadi, Cem Keskin
2016 Fourth International Conference on 3D Vision (3DV). DOI: https://doi.org/10.1109/3DV.2016.41
Abstract: In this paper, we present a novel, general, and efficient architecture for addressing computer vision problems that are approached from an 'Analysis by Synthesis' standpoint. Analysis by synthesis involves the minimization of reconstruction error, which is typically a non-convex function of the latent target variables. State-of-the-art methods adopt a hybrid scheme where discriminatively trained predictors like Random Forests or Convolutional Neural Networks are used to initialize local search algorithms. While these hybrid methods have been shown to produce promising results, they often get stuck in local optima. Our method goes beyond the conventional hybrid architecture by not only proposing multiple accurate initial solutions but by also defining a navigational structure over the solution space that can be used for extremely efficient gradient-free local search. We demonstrate the efficacy and generalizability of our approach on tasks as diverse as Hand Pose Estimation, RGB Camera Relocalization, and Image Retrieval.
3D Saliency for Finding Landmark Buildings
Nikolay Kobyshev, Hayko Riemenschneider, A. Bódis-Szomorú, L. Gool
2016 Fourth International Conference on 3D Vision (3DV). DOI: https://doi.org/10.1109/3DV.2016.35
Abstract: In urban environments the most interesting and effective factors for localization and navigation are landmark buildings. This paper proposes a novel method to detect such buildings that stand out, i.e. would be given the status of 'landmark'. The method works in a fully unsupervised way, i.e. it can be applied to different cities without requiring annotation. First, salient points are detected, based on the analysis of their features as well as those found in their spatial neighborhood. Second, learning refines the points by finding connected landmark components and training a classifier to distinguish these from common building components. Third, landmark components are aggregated into complete landmark buildings. Experiments on city-scale point clouds show the viability and efficiency of our approach on various tasks.