2016 Fourth International Conference on 3D Vision (3DV): Latest Publications

Learning Camera Viewpoint Using CNN to Improve 3D Body Pose Estimation
Pub Date: 2016-09-18 | DOI: 10.1109/3DV.2016.75
Mona Fathollahi Ghezelghieh, R. Kasturi, Sudeep Sarkar
{"title":"Learning Camera Viewpoint Using CNN to Improve 3D Body Pose Estimation","authors":"Mona Fathollahi Ghezelghieh, R. Kasturi, Sudeep Sarkar","doi":"10.1109/3DV.2016.75","DOIUrl":"https://doi.org/10.1109/3DV.2016.75","url":null,"abstract":"The objective of this work is to estimate 3D human pose from a single RGB image. Extracting image representations which incorporate both spatial relation of body parts and their relative depth plays an essential role in accurate3D pose reconstruction. In this paper, for the first time, we show that camera viewpoint in combination to 2D joint locations significantly improves 3D pose accuracy without the explicit use of perspective geometry mathematical models. To this end, we train a deep Convolutional Neural Net-work (CNN) to learn categorical camera viewpoint. To make the network robust against clothing and body shape of the subject in the image, we utilized 3D computer rendering to synthesize additional training images. We test our framework on the largest 3D pose estimation bench-mark, Human3.6m, and achieve up to 20% error reduction on standing-pose activities compared to the state-of-the-art approaches that do not use body part segmentation.","PeriodicalId":425304,"journal":{"name":"2016 Fourth International Conference on 3D Vision (3DV)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131299655","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 46
Dense Wide-Baseline Scene Flow from Two Handheld Video Cameras
Pub Date: 2016-09-16 | DOI: 10.1109/3DV.2016.36
Christian Richardt, Hyeongwoo Kim, Levi Valgaerts, C. Theobalt
{"title":"Dense Wide-Baseline Scene Flow from Two Handheld Video Cameras","authors":"Christian Richardt, Hyeongwoo Kim, Levi Valgaerts, C. Theobalt","doi":"10.1109/3DV.2016.36","DOIUrl":"https://doi.org/10.1109/3DV.2016.36","url":null,"abstract":"We propose a new technique for computing dense scene flow from two handheld videos with wide camera baselines and different photometric properties due to different sensors or camera settings like exposure and white balance. Our technique innovates in two ways over existing methods: (1) it supports independently moving cameras, and (2) it computes dense scene flow for wide-baseline scenarios. We achieve this by combining state-of-the-art wide-baseline correspondence finding with a variational scene flow formulation. First, we compute dense, wide-baseline correspondences using DAISY descriptors for matching between cameras and over time. We then detect and replace occluded pixels in the correspondence fields using a novel edge-preserving Laplacian correspondence completion technique. We finally refine the computed correspondence fields in a variational scene flow formulation. We show dense scene flow results computed from challenging datasets with independently moving, handheld cameras of varying camera settings.","PeriodicalId":425304,"journal":{"name":"2016 Fourth International Conference on 3D Vision (3DV)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131055076","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 24
SpectroMeter: Amortized Sublinear Spectral Approximation of Distance on Graphs
Pub Date: 2016-09-15 | DOI: 10.1109/3DV.2016.60
R. Litman, A. Bronstein
{"title":"SpectroMeter: Amortized Sublinear Spectral Approximation of Distance on Graphs","authors":"R. Litman, A. Bronstein","doi":"10.1109/3DV.2016.60","DOIUrl":"https://doi.org/10.1109/3DV.2016.60","url":null,"abstract":"We present a method to approximate pairwise distance on a graph, having an amortized sub-linear complexity in its size. The proposed method follows the so called heat method due to Crane et al. The only additional input are the values of the eigenfunctions of the graph Laplacian at a subset of the vertices. Using these values we estimate a random walk from the source points, and normalize the result into a unit gradient function. The eigenfunctions are then used to synthesize distance values abiding by these constraints at desired locations. We show that this method works in practice on different types of inputs ranging from triangular meshes to general graphs. We also demonstrate that the resulting approximate distance is accurate enough to be used as the input to a recent method for intrinsic shape correspondence computation.","PeriodicalId":425304,"journal":{"name":"2016 Fourth International Conference on 3D Vision (3DV)","volume":"07 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129822377","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
3D Face Reconstruction by Learning from Synthetic Data
Pub Date: 2016-09-14 | DOI: 10.1109/3DV.2016.56
Elad Richardson, Matan Sela, R. Kimmel
{"title":"3D Face Reconstruction by Learning from Synthetic Data","authors":"Elad Richardson, Matan Sela, R. Kimmel","doi":"10.1109/3DV.2016.56","DOIUrl":"https://doi.org/10.1109/3DV.2016.56","url":null,"abstract":"Fast and robust three-dimensional reconstruction of facial geometric structure from a single image is a challenging task with numerous applications. Here, we introduce a learning-based approach for reconstructing a three-dimensional face from a single image. Recent face recovery methods rely on accurate localization of key characteristic points. In contrast, the proposed approach is based on a Convolutional-Neural-Network (CNN) which extracts the face geometry directly from its image. Although such deep architectures outperform other models in complex computer vision problems, training them properly requires a large dataset of annotated examples. In the case of three-dimensional faces, currently, there are no large volume data sets, while acquiring such big-data is a tedious task. As an alternative, we propose to generate random, yet nearly photo-realistic, facial images for which the geometric form is known. The suggested model successfully recovers facial shapes from real images, even for faces with extreme expressions and under various lighting conditions.","PeriodicalId":425304,"journal":{"name":"2016 Fourth International Conference on 3D Vision (3DV)","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124361419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 283
Single-Image RGB Photometric Stereo with Spatially-Varying Albedo
Pub Date: 2016-09-14 | DOI: 10.1109/3DV.2016.34
Ayan Chakrabarti, Kalyan Sunkavalli
{"title":"Single-Image RGB Photometric Stereo with Spatially-Varying Albedo","authors":"Ayan Chakrabarti, Kalyan Sunkavalli","doi":"10.1109/3DV.2016.34","DOIUrl":"https://doi.org/10.1109/3DV.2016.34","url":null,"abstract":"We present a single-shot system to recover surface geometry of objects with spatially-varying albedos, from images captured under a calibrated RGB photometric stereo setup-with three light directions multiplexed across different color channels in the observed RGB image. Since the problem is ill-posed point-wise, we assume that the albedo map can be modeled as piece-wise constant with a restricted number of distinct albedo values. We show that under ideal conditions, the shape of a non-degenerate local constant albedo surface patch can theoretically be recovered exactly. Moreover, we present a practical and efficient algorithm that uses this model to robustly recover shape from real images. Our method first reasons about shape locally in a dense set of patches in the observed image, producing shape distributions for every patch. These local distributions are then combined to produce a single consistent surface normal map. We demonstrate the efficacy of the approach through experiments on both synthetic renderings as well as real captured images.","PeriodicalId":425304,"journal":{"name":"2016 Fourth International Conference on 3D Vision (3DV)","volume":"371 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133933557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 20
Multi-Body Non-Rigid Structure-from-Motion
Pub Date: 2016-07-15 | DOI: 10.1109/3DV.2016.23
Suryansh Kumar, Yuchao Dai, Hongdong Li
{"title":"Multi-Body Non-Rigid Structure-from-Motion","authors":"Suryansh Kumar, Yuchao Dai, Hongdong Li","doi":"10.1109/3DV.2016.23","DOIUrl":"https://doi.org/10.1109/3DV.2016.23","url":null,"abstract":"In this paper, we present the first multi-body non-rigid structure-from-motion (SFM) method, which simultaneously reconstructs and segments multiple objects that are undergoing non-rigid deformation over time. Under our formulation, 3D trajectories for each non-rigid object can be well approximated with a sparse affine combination of other 3D trajectories from the same object. The resultant optimization is solved by the alternating direction method of multipliers (ADMM). We demonstrate the efficacy of the proposed method through extensive experiments on both synthetic and real data sequences. Our method outperforms other alternative methods, such as first clustering the 2D feature tracks to groups and then doing non-rigid reconstruction in each group or first conducting 3D reconstruction by using single subspace assumption and then clustering the 3D trajectories into groups.","PeriodicalId":425304,"journal":{"name":"2016 Fourth International Conference on 3D Vision (3DV)","volume":"233 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124291522","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 44
Large Scale SfM with the Distributed Camera Model
Pub Date: 2016-07-13 | DOI: 10.1109/3DV.2016.31
Chris Sweeney, Victor Fragoso, Tobias Höllerer, M. Turk
{"title":"Large Scale SfM with the Distributed Camera Model","authors":"Chris Sweeney, Victor Fragoso, Tobias Höllerer, M. Turk","doi":"10.1109/3DV.2016.31","DOIUrl":"https://doi.org/10.1109/3DV.2016.31","url":null,"abstract":"We introduce the distributed camera model, a novel model for Structure-from-Motion (SfM). This model describes image observations in terms of light rays with ray origins and directions rather than pixels. As such, the proposed model is capable of describing a single camera or multiple cameras simultaneously as the collection of all light rays observed. We show how the distributed camera model is a generalization of the standard camera model and we describe a general formulation and solution to the absolute camera pose problem that works for standard or distributed cameras. The proposed method computes a solution that is up to 8 times more efficient and robust to rotation singularities in comparison with gDLS[21]. Finally, this method is used in an novel large-scale incremental SfM pipeline where distributed cameras are accurately and robustly merged together. This pipeline is a direct generalization of traditional incremental SfM, however, instead of incrementally adding one camera at a time to grow the reconstruction the reconstruction is grown by adding a distributed camera. Our pipeline produces highly accurate reconstructions efficiently by avoiding the need for many bundle adjustment iterations and is capable of computing a 3D model of Rome from over 15,000 images in just 22 minutes.","PeriodicalId":425304,"journal":{"name":"2016 Fourth International Conference on 3D Vision (3DV)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116684205","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 42
V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation
Pub Date: 2016-06-15 | DOI: 10.1109/3DV.2016.79
F. Milletarì, N. Navab, Seyed-Ahmad Ahmadi
{"title":"V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation","authors":"F. Milletarì, N. Navab, Seyed-Ahmad Ahmadi","doi":"10.1109/3DV.2016.79","DOIUrl":"https://doi.org/10.1109/3DV.2016.79","url":null,"abstract":"Convolutional Neural Networks (CNNs) have been recently employed to solve problems from both the computer vision and medical image analysis fields. Despite their popularity, most approaches are only able to process 2D images while most medical data used in clinical practice consists of 3D volumes. In this work we propose an approach to 3D image segmentation based on a volumetric, fully convolutional, neural network. Our CNN is trained end-to-end on MRI volumes depicting prostate, and learns to predict segmentation for the whole volume at once. We introduce a novel objective function, that we optimise during training, based on Dice coefficient. In this way we can deal with situations where there is a strong imbalance between the number of foreground and background voxels. To cope with the limited number of annotated volumes available for training, we augment the data applying random non-linear transformations and histogram matching. We show in our experimental evaluation that our approach achieves good performances on challenging test data while requiring only a fraction of the processing time needed by other previous methods.","PeriodicalId":425304,"journal":{"name":"2016 Fourth International Conference on 3D Vision (3DV)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132926154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6402
Deeper Depth Prediction with Fully Convolutional Residual Networks
Pub Date: 2016-06-01 | DOI: 10.1109/3DV.2016.32
Iro Laina, C. Rupprecht, Vasileios Belagiannis, Federico Tombari, Nassir Navab
{"title":"Deeper Depth Prediction with Fully Convolutional Residual Networks","authors":"Iro Laina, C. Rupprecht, Vasileios Belagiannis, Federico Tombari, Nassir Navab","doi":"10.1109/3DV.2016.32","DOIUrl":"https://doi.org/10.1109/3DV.2016.32","url":null,"abstract":"This paper addresses the problem of estimating the depth map of a scene given a single RGB image. We propose a fully convolutional architecture, encompassing residual learning, to model the ambiguous mapping between monocular images and depth maps. In order to improve the output resolution, we present a novel way to efficiently learn feature map up-sampling within the network. For optimization, we introduce the reverse Huber loss that is particularly suited for the task at hand and driven by the value distributions commonly present in depth maps. Our model is composed of a single architecture that is trained end-to-end and does not rely on post-processing techniques, such as CRFs or other additional refinement steps. As a result, it runs in real-time on images or videos. In the evaluation, we show that the proposed model contains fewer parameters and requires fewer training data than the current state of the art, while outperforming all approaches on depth estimation. Code and models are publicly available.","PeriodicalId":425304,"journal":{"name":"2016 Fourth International Conference on 3D Vision (3DV)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134063017","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1598
Joint Semantic Segmentation and Depth Estimation with Deep Convolutional Networks
Pub Date: 2016-04-25 | DOI: 10.1109/3DV.2016.69
Arsalan Mousavian, H. Pirsiavash, J. Kosecka
{"title":"Joint Semantic Segmentation and Depth Estimation with Deep Convolutional Networks","authors":"Arsalan Mousavian, H. Pirsiavash, J. Kosecka","doi":"10.1109/3DV.2016.69","DOIUrl":"https://doi.org/10.1109/3DV.2016.69","url":null,"abstract":"Multi-scale deep CNNs have been used successfully for problems mapping each pixel to a label, such as depth estimation and semantic segmentation. It has also been shown that such architectures are reusable and can be used for multiple tasks. These networks are typically trained independently for each task by varying the output layer(s) and training objective. In this work we present a new model for simultaneous depth estimation and semantic segmentation from a single RGB image. Our approach demonstrates the feasibility of training parts of the model for each task and then fine tuning the full, combined model on both tasks simultaneously using a single loss function. Furthermore we couple the deep CNN with fully connected CRF, which captures the contextual relationships and interactions between the semantic and depth cues improving the accuracy of the final results. The proposed model is trained and evaluated on NYUDepth V2 dataset [23] outperforming the state of the art methods on semantic segmentation and achieving comparable results on the task of depth estimation.","PeriodicalId":425304,"journal":{"name":"2016 Fourth International Conference on 3D Vision (3DV)","volume":"184 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124671836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 126