2019 IEEE/CVF International Conference on Computer Vision (ICCV)最新文献_第10页

Stochastic Exposure Coding for Handling Multi-ToF-Camera Interference 随机曝光编码处理多tof相机干扰

2019 IEEE/CVF International Conference on Computer Vision (ICCV) Pub Date : 2019-10-01 DOI: 10.1109/ICCV.2019.00797

Jongho Lee, Mohit Gupta

{"title":"Stochastic Exposure Coding for Handling Multi-ToF-Camera Interference","authors":"Jongho Lee, Mohit Gupta","doi":"10.1109/ICCV.2019.00797","DOIUrl":"https://doi.org/10.1109/ICCV.2019.00797","url":null,"abstract":"As continuous-wave time-of-flight (C-ToF) cameras become popular in 3D imaging applications, they need to contend with the problem of multi-camera interference (MCI). In a multi-camera environment, a ToF camera may receive light from the sources of other cameras, resulting in large depth errors. In this paper, we propose stochastic exposure coding (SEC), a novel approach for mitigating. SEC involves dividing a camera's integration time into multiple slots, and switching the camera off and on stochastically during each slot. This approach has two benefits. First, by appropriately choosing the on probability for each slot, the camera can effectively filter out both the AC and DC components of interfering signals, thereby mitigating depth errors while also maintaining high signal-to-noise ratio. This enables high accuracy depth recovery with low power consumption. Second, this approach can be implemented without modifying the C-ToF camera's coding functions, and thus, can be used with a wide range of cameras with minimal changes. We demonstrate the performance benefits of SEC with theoretical analysis, simulations and real experiments, across a wide range of imaging scenarios.","PeriodicalId":6728,"journal":{"name":"2019 IEEE/CVF International Conference on Computer Vision (ICCV)","volume":"201 1","pages":"7879-7887"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88864205","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 8

Markerless Outdoor Human Motion Capture Using Multiple Autonomous Micro Aerial Vehicles 使用多个自主微型飞行器的无标记户外人体动作捕捉

2019 IEEE/CVF International Conference on Computer Vision (ICCV) Pub Date : 2019-10-01 DOI: 10.1109/ICCV.2019.00091

Nitin Saini, E. Price, Rahul Tallamraju, R. Enficiaud, R. Ludwig, Igor Martinovic, Aamir Ahmad, Michael J. Black

{"title":"Markerless Outdoor Human Motion Capture Using Multiple Autonomous Micro Aerial Vehicles","authors":"Nitin Saini, E. Price, Rahul Tallamraju, R. Enficiaud, R. Ludwig, Igor Martinovic, Aamir Ahmad, Michael J. Black","doi":"10.1109/ICCV.2019.00091","DOIUrl":"https://doi.org/10.1109/ICCV.2019.00091","url":null,"abstract":"Capturing human motion in natural scenarios means moving motion capture out of the lab and into the wild. Typical approaches rely on fixed, calibrated, cameras and reflective markers on the body, significantly limiting the motions that can be captured. To make motion capture truly unconstrained, we describe the first fully autonomous outdoor capture system based on flying vehicles. We use multiple micro-aerial-vehicles(MAVs), each equipped with a monocular RGB camera, an IMU, and a GPS receiver module. These detect the person, optimize their position, and localize themselves approximately. We then develop a markerless motion capture method that is suitable for this challenging scenario with a distant subject, viewed from above, with approximately calibrated and moving cameras. We combine multiple state-of-the-art 2D joint detectors with a 3D human body model and a powerful prior on human pose. We jointly optimize for 3D body pose and camera pose to robustly fit the 2D measurements. To our knowledge, this is the first successful demonstration of outdoor, full-body, markerless motion capture from autonomous flying vehicles.","PeriodicalId":6728,"journal":{"name":"2019 IEEE/CVF International Conference on Computer Vision (ICCV)","volume":"26 1","pages":"823-832"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87004476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 30

Deep Learning for Light Field Saliency Detection 光场显著性检测的深度学习

2019 IEEE/CVF International Conference on Computer Vision (ICCV) Pub Date : 2019-10-01 DOI: 10.1109/ICCV.2019.00893

Tiantian Wang, Yongri Piao, Huchuan Lu, Xiao Li, Lihe Zhang

{"title":"Deep Learning for Light Field Saliency Detection","authors":"Tiantian Wang, Yongri Piao, Huchuan Lu, Xiao Li, Lihe Zhang","doi":"10.1109/ICCV.2019.00893","DOIUrl":"https://doi.org/10.1109/ICCV.2019.00893","url":null,"abstract":"Recent research in 4D saliency detection is limited by the deficiency of a large-scale 4D light field dataset. To address this, we introduce a new dataset to assist the subsequent research in 4D light field saliency detection. To the best of our knowledge, this is to date the largest light field dataset in which the dataset provides 1465 all-focus images with human-labeled ground truth masks and the corresponding focal stacks for every light field image. To verify the effectiveness of the light field data, we first introduce a fusion framework which includes two CNN streams where the focal stacks and all-focus images serve as the input. The focal stack stream utilizes a recurrent attention mechanism to adaptively learn to integrate every slice in the focal stack, which benefits from the extracted features of the good slices. Then it is incorporated with the output map generated by the all-focus stream to make the saliency prediction. In addition, we introduce adversarial examples by adding noise intentionally into images to help train the deep network, which can improve the robustness of the proposed network. The noise is designed by users, which is imperceptible but can fool the CNNs to make the wrong prediction. Extensive experiments show the effectiveness and superiority of the proposed model on the popular evaluation metrics. The proposed method performs favorably compared with the existing 2D, 3D and 4D saliency detection methods on the proposed dataset and existing LFSD light field dataset. The code and results can be found at https://github.com/OIPLab-DUT/ ICCV2019_Deeplightfield_Saliency. Moreover, to facilitate research in this field, all images we collected are shared in a ready-to-use manner.","PeriodicalId":6728,"journal":{"name":"2019 IEEE/CVF International Conference on Computer Vision (ICCV)","volume":"22 1","pages":"8837-8847"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87386933","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 79

On Boosting Single-Frame 3D Human Pose Estimation via Monocular Videos 利用单目视频增强单帧3D人体姿态估计

2019 IEEE/CVF International Conference on Computer Vision (ICCV) Pub Date : 2019-10-01 DOI: 10.1109/ICCV.2019.00228

Zhi Li, Xuan Wang, Fei Wang, Peilin Jiang

{"title":"On Boosting Single-Frame 3D Human Pose Estimation via Monocular Videos","authors":"Zhi Li, Xuan Wang, Fei Wang, Peilin Jiang","doi":"10.1109/ICCV.2019.00228","DOIUrl":"https://doi.org/10.1109/ICCV.2019.00228","url":null,"abstract":"The premise of training an accurate 3D human pose estimation network is the possession of huge amount of richly annotated training data. Nonetheless, manually obtaining rich and accurate annotations is, even not impossible, tedious and slow. In this paper, we propose to exploit monocular videos to complement the training dataset for the single-image 3D human pose estimation tasks. At the beginning, a baseline model is trained with a small set of annotations. By fixing some reliable estimations produced by the resulting model, our method automatically collects the annotations across the entire video as solving the 3D trajectory completion problem. Then, the baseline model is further trained with the collected annotations to learn the new poses. We evaluate our method on the broadly-adopted Human3.6M and MPI-INF-3DHP datasets. As illustrated in experiments, given only a small set of annotations, our method successfully makes the model to learn new poses from unlabelled monocular videos, promoting the accuracies of the baseline model by about 10%. By contrast with previous approaches, our method does not rely on either multi-view imagery or any explicit 2D keypoint annotations.","PeriodicalId":6728,"journal":{"name":"2019 IEEE/CVF International Conference on Computer Vision (ICCV)","volume":"45 1","pages":"2192-2201"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87726684","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 34

Convex Shape Prior for Multi-Object Segmentation Using a Single Level Set Function 基于单水平集函数的凸形状先验多目标分割

2019 IEEE/CVF International Conference on Computer Vision (ICCV) Pub Date : 2019-10-01 DOI: 10.1109/ICCV.2019.00070

Shousheng Luo, X. Tai, Limei Huo, Yang Wang, R. Glowinski

引用次数: 17

SANet: Scene Agnostic Network for Camera Localization SANet:用于摄像机定位的场景不可知网络

2019 IEEE/CVF International Conference on Computer Vision (ICCV) Pub Date : 2019-10-01 DOI: 10.1109/ICCV.2019.00013

Luwei Yang, Ziqian Bai, Chengzhou Tang, Honghua Li, Yasutaka Furukawa, P. Tan

引用次数: 57

Neural Turtle Graphics for Modeling City Road Layouts 用于建模城市道路布局的神经龟图形

2019 IEEE/CVF International Conference on Computer Vision (ICCV) Pub Date : 2019-10-01 DOI: 10.1109/ICCV.2019.00462

Hang Chu, Daiqing Li, David Acuna, Amlan Kar, Maria Shugrina, Xinkai Wei, Ming-Yu Liu, A. Torralba, S. Fidler

引用次数: 53

Spatial Correspondence With Generative Adversarial Network: Learning Depth From Monocular Videos 空间对应与生成对抗网络:从单目视频学习深度

2019 IEEE/CVF International Conference on Computer Vision (ICCV) Pub Date : 2019-10-01 DOI: 10.1109/ICCV.2019.00759

Zhenyao Wu, Xinyi Wu, Xiaoping Zhang, Song Wang, L. Ju

{"title":"Spatial Correspondence With Generative Adversarial Network: Learning Depth From Monocular Videos","authors":"Zhenyao Wu, Xinyi Wu, Xiaoping Zhang, Song Wang, L. Ju","doi":"10.1109/ICCV.2019.00759","DOIUrl":"https://doi.org/10.1109/ICCV.2019.00759","url":null,"abstract":"Depth estimation from monocular videos has important applications in many areas such as autonomous driving and robot navigation. It is a very challenging problem without knowing the camera pose since errors in camera-pose estimation can significantly affect the video-based depth estimation accuracy. In this paper, we present a novel SC-GAN network with end-to-end adversarial training for depth estimation from monocular videos without estimating the camera pose and pose change over time. To exploit cross-frame relations, SC-GAN includes a spatial correspondence module which uses Smolyak sparse grids to efficiently match the features across adjacent frames, and an attention mechanism to learn the importance of features in different directions. Furthermore, the generator in SC-GAN learns to estimate depth from the input frames, while the discriminator learns to distinguish between the ground-truth and estimated depth map for the reference frame. Experiments on the KITTI and Cityscapes datasets show that the proposed SC-GAN can achieve much more accurate depth maps than many existing state-of-the-art methods on monocular videos.","PeriodicalId":6728,"journal":{"name":"2019 IEEE/CVF International Conference on Computer Vision (ICCV)","volume":"55 1 1","pages":"7493-7503"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86070249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 20

Deep Blind Hyperspectral Image Fusion

2019 IEEE/CVF International Conference on Computer Vision (ICCV) Pub Date : 2019-10-01 DOI: 10.1109/ICCV.2019.00425

Wu Wang, Weihong Zeng, Yue Huang, Xinghao Ding, J. Paisley

引用次数: 52

Discriminative Feature Transformation for Occluded Pedestrian Detection 遮挡行人检测的判别特征变换

2019 IEEE/CVF International Conference on Computer Vision (ICCV) Pub Date : 2019-10-01 DOI: 10.1109/ICCV.2019.00965

Chunluan Zhou, Ming Yang, Junsong Yuan

{"title":"Discriminative Feature Transformation for Occluded Pedestrian Detection","authors":"Chunluan Zhou, Ming Yang, Junsong Yuan","doi":"10.1109/ICCV.2019.00965","DOIUrl":"https://doi.org/10.1109/ICCV.2019.00965","url":null,"abstract":"Despite promising performance achieved by deep con- volutional neural networks for non-occluded pedestrian de- tection, it remains a great challenge to detect partially oc- cluded pedestrians. Compared with non-occluded pedes- trian examples, it is generally more difficult to distinguish occluded pedestrian examples from background in featue space due to the missing of occluded parts. In this paper, we propose a discriminative feature transformation which en- forces feature separability of pedestrian and non-pedestrian examples to handle occlusions for pedestrian detection. Specifically, in feature space it makes pedestrian exam- ples approach the centroid of easily classified non-occluded pedestrian examples and pushes non-pedestrian examples close to the centroid of easily classified non-pedestrian ex- amples. Such a feature transformation partially compen- sates the missing contribution of occluded parts in feature space, therefore improving the performance for occluded pedestrian detection. We implement our approach in the Fast R-CNN framework by adding one transformation net- work branch. We validate the proposed approach on two widely used pedestrian detection datasets: Caltech and CityPersons. Experimental results show that our approach achieves promising performance for both non-occluded and occluded pedestrian detection.","PeriodicalId":6728,"journal":{"name":"2019 IEEE/CVF International Conference on Computer Vision (ICCV)","volume":"327 1","pages":"9556-9565"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86778241","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 41