2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR): Latest Publications

Dynamic Traffic Modeling From Overhead Imagery
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) · Pub Date: 2020-06-01 · DOI: 10.1109/CVPR42600.2020.01233 · Pages: 12312-12321
Scott Workman, Nathan Jacobs
Abstract: Our goal is to use overhead imagery to understand patterns in traffic flow, for instance answering questions such as how fast could you traverse Times Square at 3am on a Sunday. A traditional approach for solving this problem would be to model the speed of each road segment as a function of time. However, this strategy is limited in that a significant amount of data must first be collected before a model can be used, and it fails to generalize to new areas. Instead, we propose an automatic approach for generating dynamic maps of traffic speeds using convolutional neural networks. Our method operates on overhead imagery, is conditioned on location and time, and outputs a local motion model that captures likely directions of travel and corresponding travel speeds. To train our model, we take advantage of historical traffic data collected from New York City. Experimental results demonstrate that our method can be applied to generate accurate city-scale traffic models.
Citations: 13
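The abstract above describes a convolutional network that is conditioned on location and time and outputs local travel speeds. As a rough illustration of how such conditioning can be wired, here is a minimal sketch; the architecture, layer sizes, cyclical time encoding, and all names (`TrafficSpeedNet`, `time_encoding`) are assumptions for illustration, not the authors' model.

```python
# Sketch: a CNN over an overhead image tile, conditioned on a time encoding
# broadcast as extra channels, predicting a per-pixel speed map. (Illustrative
# only; not the paper's architecture.)
import math
import torch
import torch.nn as nn

class TrafficSpeedNet(nn.Module):
    def __init__(self, time_dim=4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        # Time enters as extra channels broadcast over the spatial grid.
        self.head = nn.Sequential(
            nn.Conv2d(64 + time_dim, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 1),  # predicted speed per pixel
        )

    def forward(self, image, time_feat):
        # image: (B, 3, H, W); time_feat: (B, time_dim)
        f = self.encoder(image)
        t = time_feat[:, :, None, None].expand(-1, -1, f.shape[2], f.shape[3])
        return self.head(torch.cat([f, t], dim=1))

def time_encoding(hour, weekday):
    # Encode hour-of-day and day-of-week on circles so 23:00 is near 00:00.
    return torch.tensor([math.sin(2 * math.pi * hour / 24),
                         math.cos(2 * math.pi * hour / 24),
                         math.sin(2 * math.pi * weekday / 7),
                         math.cos(2 * math.pi * weekday / 7)])
```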
Probabilistic Video Prediction From Noisy Data With a Posterior Confidence
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) · Pub Date: 2020-06-01 · DOI: 10.1109/CVPR42600.2020.01084 · Pages: 10827-10836
Yunbo Wang, Jiajun Wu, Mingsheng Long, J. Tenenbaum
Abstract: We study the new problem of probabilistic future-frame prediction from a sequence of noisy inputs, which is useful because the quality of input frames is difficult to guarantee in practical spatiotemporal prediction applications. It is also challenging because it involves two levels of uncertainty: the perceptual uncertainty from noisy observations and the dynamics uncertainty in forward modeling. In this paper, we propose to tackle this problem with an end-to-end trainable model named Bayesian Predictive Network (BP-Net). Unlike previous work in stochastic video prediction that assumes spatiotemporal coherence and therefore fails to deal with perceptual uncertainty, BP-Net models both levels of uncertainty in an integrated framework. Furthermore, unlike previous work that can only provide unsorted estimates of future frames, BP-Net leverages a differentiable sequential importance sampling (SIS) approach to make predictions based on the inference of underlying physical states, thereby providing prediction candidates sorted by their SIS importance weights, i.e., their confidences. Our experimental results demonstrate that BP-Net remarkably outperforms existing approaches at predicting future frames from noisy data.
Citations: 12
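The BP-Net abstract's key output is a set of future-frame candidates sorted by sequential importance sampling (SIS) weights. The sketch below shows only that final ranking step under assumed tensor shapes; the network and the SIS inference itself are omitted.

```python
# Sketch: normalize SIS log-weights into confidences and rank candidate
# future frames, best first. Shapes are assumptions for illustration.
import torch

def rank_candidates(candidates, log_weights):
    # candidates: (K, C, H, W) sampled future frames
    # log_weights: (K,) unnormalized SIS log importance weights
    conf = torch.softmax(log_weights, dim=0)       # normalized confidences
    order = torch.argsort(conf, descending=True)   # most confident first
    return candidates[order], conf[order]

# Toy usage with random candidates.
frames = torch.randn(5, 3, 64, 64)
logw = torch.randn(5)
sorted_frames, confidences = rank_candidates(frames, logw)
```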
What Does Plate Glass Reveal About Camera Calibration?
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) · Pub Date: 2020-06-01 · DOI: 10.1109/cvpr42600.2020.00309 · Pages: 3019-3029
Qian Zheng, Jinnan Chen, Zhangchi Lu, Boxin Shi, Xudong Jiang, Kim-Hui Yap, Ling-yu Duan, A. Kot
Abstract: This paper aims to calibrate the orientation of glass and the field of view of the camera from a single reflection-contaminated image. We show how a reflective amplitude coefficient map can be used as a calibration cue. Unlike existing methods, the proposed solution is independent of image content. To reduce the impact of a noisy calibration cue estimated from a reflection-contaminated image, we propose two strategies: an optimization-based method that uses only a partial but reliable subset of the map's entries, and a learning-based method that fully exploits all entries. We collect a dataset containing 320 samples, together with their camera parameters, for evaluation. We demonstrate that our method not only facilitates a general single-image camera calibration method that leverages image content but also helps improve the performance of single-image reflection removal. Furthermore, we show that our byproduct output helps alleviate the ill-posed problem of estimating a panorama from a single image.
Citations: 10
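The "partial but reliable entries" strategy suggests a masked fit: estimate calibration parameters from only the trusted entries of a noisy cue map. The following toy sketch illustrates that idea with a generic least-squares fit; `predict_map` is a hypothetical stand-in for the paper's physics-based reflective-amplitude model, whose actual form is not given here.

```python
# Sketch: fit parameters theta to a noisy cue map, using only entries whose
# confidence exceeds a threshold. The plane model below is a toy example.
import numpy as np
from scipy.optimize import least_squares

def fit_calibration(cue_map, confidence, predict_map, theta0, tau=0.8):
    mask = confidence > tau                      # keep only trusted entries
    def residuals(theta):
        return (predict_map(theta) - cue_map)[mask]
    return least_squares(residuals, theta0).x

# Toy usage: recover a plane a*x + b*y from a noisy 2D map.
H, W = 32, 32
ys, xs = np.mgrid[0:H, 0:W]
def predict_map(theta):
    return theta[0] * xs + theta[1] * ys

truth = predict_map([0.5, -0.2])
noisy = truth + 0.1 * np.random.randn(H, W)
conf = np.random.rand(H, W)                      # stand-in confidence map
theta = fit_calibration(noisy, conf, predict_map, theta0=[0.0, 0.0])
```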
A2dele: Adaptive and Attentive Depth Distiller for Efficient RGB-D Salient Object Detection
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) · Pub Date: 2020-06-01 · DOI: 10.1109/CVPR42600.2020.00908 · Pages: 9057-9066
Yongri Piao, Zhengkun Rong, Miao Zhang, Weisong Ren, Huchuan Lu
Abstract: Existing state-of-the-art RGB-D salient object detection methods explore RGB-D data using a two-stream architecture, in which an independent subnetwork is required to process the depth data. This inevitably incurs extra computational cost and memory consumption, and relying on depth data at test time may hinder practical applications of RGB-D saliency detection. To tackle these two dilemmas, we propose a depth distiller (A2dele) that uses network prediction and attention as two bridges to transfer depth knowledge from the depth stream to the RGB stream. First, by adaptively minimizing the differences between predictions generated by the depth stream and the RGB stream, we control the pixel-wise depth knowledge transferred to the RGB stream. Second, to transfer localization knowledge to RGB features, we encourage consistency between the dilated prediction of the depth stream and the attention map of the RGB stream. As a result, by embedding our A2dele, we achieve a lightweight architecture that uses no depth data at test time. Our extensive experimental evaluation on five benchmarks demonstrates that our RGB stream achieves state-of-the-art performance while reducing model size by 76% and running 12 times faster than the best-performing method. Furthermore, A2dele can be applied to existing RGB-D networks to significantly improve their efficiency while maintaining performance (boosting FPS nearly 2x for DMRA and 3x for CPFP).
Citations: 155
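The two transfer bridges in the A2dele abstract, prediction matching and attention-to-dilated-prediction consistency, can be written as two simple losses. Below is a minimal sketch assuming saliency maps in [0, 1]; max-pooling as the dilation and binary cross-entropy as the distance are my assumptions, not necessarily the paper's exact choices.

```python
# Sketch: depth-to-RGB distillation losses. The depth stream acts as a frozen
# teacher; the RGB stream learns to reproduce its prediction and to align its
# attention map with a dilated version of that prediction.
import torch
import torch.nn.functional as F

def distill_losses(rgb_pred, depth_pred, rgb_attn, dilate_ks=7):
    # rgb_pred, depth_pred, rgb_attn: (B, 1, H, W) maps in [0, 1]
    depth_pred = depth_pred.detach()               # teacher is not updated
    pred_loss = F.binary_cross_entropy(rgb_pred, depth_pred)
    # Max-pooling with stride 1 acts as a cheap morphological dilation.
    dilated = F.max_pool2d(depth_pred, dilate_ks, stride=1,
                           padding=dilate_ks // 2)
    attn_loss = F.binary_cross_entropy(rgb_attn, dilated)
    return pred_loss, attn_loss
```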
Reflection Scene Separation From a Single Image
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) · Pub Date: 2020-06-01 · DOI: 10.1109/cvpr42600.2020.00247 · Pages: 2395-2403
Renjie Wan, Boxin Shi, Haoliang Li, Ling-yu Duan, A. Kot
Abstract: For images taken through glass, existing methods focus on restoring the background scene, regarding the reflection components as noise. However, the scene reflected by the glass surface also contains important information to be recovered, especially for surveillance or criminal investigations. In this paper, instead of removing reflection components from the mixture image, we aim to recover the reflection scene from it. We first propose a strategy to obtain such ground truth and its corresponding input images. Then, we propose a two-stage framework to obtain the visible reflection scene from the mixture image. Specifically, we train the network with a shift-invariant loss that is robust to misalignment between the input and output images. Experimental results show that our proposed method achieves promising results.
Citations: 17
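A shift-invariant loss of the kind the abstract mentions can be sketched as the minimum reconstruction error over a small window of integer translations, so mild misalignment between prediction and ground truth is not penalized. The search radius and the L1 distance here are assumptions.

```python
# Sketch: take the minimum L1 error over all integer shifts within a small
# radius. torch.roll wraps at borders, which a real loss would mask out;
# kept simple here for illustration.
import torch
import torch.nn.functional as F

def shift_invariant_l1(pred, target, radius=2):
    # pred, target: (B, C, H, W)
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = torch.roll(target, shifts=(dy, dx), dims=(2, 3))
            loss = F.l1_loss(pred, shifted)
            best = loss if best is None else torch.minimum(best, loss)
    return best
```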
PandaNet: Anchor-Based Single-Shot Multi-Person 3D Pose Estimation
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) · Pub Date: 2020-06-01 · DOI: 10.1109/cvpr42600.2020.00689 · Pages: 6855-6864
Abdallah Benzine, Florian Chabot, B. Luvison, Q. Pham, C. Achard
Abstract: Recently, several deep learning models have been proposed for 3D human pose estimation. Nevertheless, most of these approaches only address the single-person case or estimate the 3D pose of a few people at high resolution. Furthermore, many applications, such as autonomous driving or crowd analysis, require pose estimation for a large number of people, possibly at low resolution. In this work, we present PandaNet (Pose estimAtioN and Detection Anchor-based Network), a new single-shot, anchor-based, multi-person 3D pose estimation approach. The proposed model performs bounding-box detection and, for each detected person, 2D and 3D pose regression in a single forward pass. It needs no post-processing to regroup joints, since the network predicts a full 3D pose for each bounding box, and it allows pose estimation for a possibly large number of people at low resolution. To handle overlapping people, we introduce a Pose-Aware Anchor Selection strategy. Moreover, since people of different sizes are imbalanced in images, and joint coordinates carry different uncertainties depending on those sizes, we propose a method that automatically optimizes the weights associated with different people scales and joints for efficient training. PandaNet surpasses previous single-shot methods on several challenging datasets: a virtual but very realistic multi-person urban dataset (JTA Dataset) and two real-world 3D multi-person datasets (CMU Panoptic and MuPoTS-3D).
Citations: 44
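For the automatic weighting of people scales and joints, one common formulation (whether it is the paper's exact one is an assumption) is the homoscedastic-uncertainty loss L = sum_i exp(-s_i) * L_i + s_i, with one learnable log-variance s_i per loss term:

```python
# Sketch: learnable loss weights. Each term's weight exp(-s_i) shrinks when
# its loss is noisy/uncertain, and the +s_i term keeps weights from collapsing
# to zero. One s_i per people-scale or joint group is assumed.
import torch
import torch.nn as nn

class AutoWeightedLoss(nn.Module):
    def __init__(self, n_terms):
        super().__init__()
        self.log_vars = nn.Parameter(torch.zeros(n_terms))

    def forward(self, losses):
        # losses: sequence of scalar task losses, one per term
        total = torch.zeros((), device=self.log_vars.device)
        for s, loss in zip(self.log_vars, losses):
            total = total + torch.exp(-s) * loss + s
        return total
```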
Joint Filtering of Intensity Images and Neuromorphic Events for High-Resolution Noise-Robust Imaging
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) · Pub Date: 2020-06-01 · DOI: 10.1109/cvpr42600.2020.00168 · Pages: 1606-1616
Zihao W. Wang, Peiqi Duan, O. Cossairt, A. Katsaggelos, Tiejun Huang, Boxin Shi
Abstract: We present a novel computational imaging system with high resolution and low noise. Our system consists of a traditional video camera, which captures high-resolution intensity images, and an event camera, which encodes high-speed motion as a stream of asynchronous binary events. To process the hybrid input, we propose a unifying framework that first bridges the two sensing modalities via a noise-robust motion compensation model and then performs joint image filtering. The filtered output represents the temporal gradient of the captured space-time volume, which can be viewed as motion-compensated event frames with high resolution and low noise. The output can therefore be applied to many existing event-based algorithms that depend strongly on spatial resolution and noise robustness. In experiments on publicly available datasets as well as our contributed RGB-DAVIS dataset, we show systematic performance improvements in applications such as high-frame-rate video synthesis, feature/corner detection and tracking, and high-dynamic-range image reconstruction.
Citations: 67
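A core step in the abstract is motion-compensated accumulation of asynchronous events into frames. The sketch below shows a deliberately simplified version: each event is warped along an estimated flow to a reference timestamp and its polarity accumulated. The paper's noise-robust compensation model is more involved; the shapes and nearest-pixel rounding here are assumptions.

```python
# Sketch: accumulate events into a motion-compensated frame. Each event
# (x, y, t, polarity) is shifted along the per-pixel flow for the time gap
# to t_ref, then its +1/-1 polarity is added at the compensated pixel.
import numpy as np

def compensated_event_frame(events, flow, t_ref, shape):
    # events: (N, 4) array of (x, y, t, polarity); flow: (H, W, 2) in px/sec
    H, W = shape
    frame = np.zeros(shape, dtype=np.float32)
    for x, y, t, p in events:
        dt = t_ref - t
        u, v = flow[int(y), int(x)]
        xc, yc = int(round(x + u * dt)), int(round(y + v * dt))
        if 0 <= xc < W and 0 <= yc < H:
            frame[yc, xc] += p
    return frame
```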
PFRL: Pose-Free Reinforcement Learning for 6D Pose Estimation
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) · Pub Date: 2020-06-01 · DOI: 10.1109/cvpr42600.2020.01147 · Pages: 11451-11460
Jianzhun Shao, Yuhang Jiang, Gu Wang, Zhigang Li, Xiangyang Ji
Abstract: 6D pose estimation from a single RGB image is a challenging and vital task in computer vision. Current mainstream deep-model methods resort to 2D images annotated with real-world ground-truth 6D object poses, whose collection is fairly cumbersome and expensive, and even unavailable in many cases. In this work, to remove the burden of 6D annotations, we formulate 6D pose refinement as a Markov Decision Process and adopt a reinforcement learning approach that uses only 2D image annotations as weakly supervised 6D pose information, via a careful reward definition and a composite reinforced optimization method for efficient and effective policy training. Experiments on the LINEMOD and T-LESS datasets demonstrate that our pose-free approach achieves state-of-the-art performance compared with methods that do not use real-world ground-truth 6D pose labels.
Citations: 25
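The PFRL abstract formulates pose refinement as an MDP rewarded only with 2D annotations. As a heavily simplified illustration of that idea, here is a REINFORCE-style update with a mask-IoU reward; `policy`, `render_mask`, and the reward itself are hypothetical placeholders, not the paper's components or its actual reward definition.

```python
# Sketch: one weakly supervised refinement step. The policy proposes a small
# pose delta, the reward compares the rendered 2D projection against a 2D
# annotation (no 6D label involved), and REINFORCE updates the policy.
import torch

def refine_step(policy, optimizer, pose, image, gt_mask, render_mask):
    mean, std = policy(image, pose)              # Gaussian over pose deltas
    dist = torch.distributions.Normal(mean, std)
    delta = dist.sample()
    new_pose = pose + delta
    pred_mask = render_mask(new_pose)            # differentiability not needed
    inter = (pred_mask * gt_mask).sum()
    union = ((pred_mask + gt_mask) > 0).float().sum()
    reward = inter / (union + 1e-6)              # 2D-only reward (mask IoU)
    loss = -dist.log_prob(delta).sum() * reward.detach()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return new_pose.detach(), reward.item()
```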
Separating Particulate Matter From a Single Microscopic Image
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) · Pub Date: 2020-06-01 · DOI: 10.1109/CVPR42600.2020.00464 · Pages: 4583-4592
Tushar Sandhan, J. Choi
Abstract: Particulate matter (PM) is the blend of various solid and liquid particles suspended in the atmosphere. These submicron particles are imperceptible in ordinary hand-held photography but become a great obstacle in microscopic imaging. Removing PM from a single microscopic image is highly ill-posed and one of the most challenging image denoising problems. In this work, we thoroughly analyze the physical properties of PM, the microscope, and their inevitable interaction, and propose an optimization scheme that removes PM from a high-resolution microscopic image within a few seconds. Experiments on real-world microscopic images show that the proposed method significantly outperforms other competitive image denoising methods. It preserves comprehensive microscopic foreground details while clearly separating the PM from a single monochromatic or color image.
Citations: 0
Searching for Actions on the Hyperbole
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) · Pub Date: 2020-06-01 · DOI: 10.1109/cvpr42600.2020.00122 · Pages: 1138-1147
Teng Long, P. Mettes, Heng Tao Shen, Cees G. M. Snoek
Abstract: In this paper, we introduce hierarchical action search. Starting from the observation that hierarchies are mostly ignored in the action literature, we retrieve not only individual actions but also relevant and related actions, given an action name or video example as input. We propose a hyperbolic action network, which is centered around a hyperbolic space shared by action hierarchies and videos. Our discriminative hyperbolic embedding projects actions onto the shared space while jointly optimizing hypernym-hyponym relations between action pairs and a large-margin separation between all actions. The projected actions serve as hyperbolic prototypes that we match with projected video representations. The result is a learned space where videos are positioned in entailment cones formed by different subtrees. To perform search in this space, we start from a query and increasingly enlarge its entailment cone to retrieve hierarchically relevant action videos. Experiments on three action datasets with new hierarchy annotations show the effectiveness of our approach for hierarchical action search by name and by video example, regardless of whether the queried actions were seen during training. Our implementation is available at https://github.com/Tenglon/hyperbolic_action
Citations: 32
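The hyperbolic action network matches video embeddings against action prototypes in hyperbolic space. The distance function below is the standard Poincaré-ball metric such an embedding would use; reducing the entailment-cone retrieval loop to plain nearest-prototype search is my simplification, not the paper's procedure.

```python
# Sketch: Poincaré-ball distance d(u, v) = acosh(1 + 2||u-v||^2 /
# ((1-||u||^2)(1-||v||^2))), then nearest-prototype retrieval.
import torch

def poincare_distance(u, v, eps=1e-6):
    # u, v: points strictly inside the unit ball, shape (..., D)
    sq = torch.sum((u - v) ** 2, dim=-1)
    nu = torch.clamp(torch.sum(u * u, dim=-1), max=1 - eps)
    nv = torch.clamp(torch.sum(v * v, dim=-1), max=1 - eps)
    return torch.acosh(1 + 2 * sq / ((1 - nu) * (1 - nv)))

# Toy usage: rank 100 action prototypes against one video embedding.
prototypes = torch.randn(100, 32) * 0.1   # assumed to lie inside the ball
query = torch.randn(32) * 0.1             # projected video representation
d = poincare_distance(query.unsqueeze(0), prototypes)
top5 = torch.argsort(d)[:5]               # nearest actions in hyperbolic space
```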