Probabilistic Video Prediction From Noisy Data With a Posterior Confidence
Yunbo Wang, Jiajun Wu, Mingsheng Long, J. Tenenbaum
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 10827-10836, June 2020
DOI: 10.1109/CVPR42600.2020.01084
Citations: 12
Abstract
We study the new problem of probabilistic future-frame prediction from a sequence of noisy inputs. This setting matters in practice because the quality of input frames is difficult to guarantee in real spatiotemporal prediction applications. It is also challenging because it involves two levels of uncertainty: the perceptual uncertainty arising from noisy observations and the dynamics uncertainty in forward modeling. In this paper, we propose to tackle this problem with an end-to-end trainable model named Bayesian Predictive Network (BP-Net). Unlike previous work in stochastic video prediction that assumes spatiotemporal coherence and therefore fails to deal with perceptual uncertainty, BP-Net models both levels of uncertainty in an integrated framework. Furthermore, unlike previous methods that can only provide unranked estimates of future frames, BP-Net leverages a differentiable sequential importance sampling (SIS) approach to make predictions based on the inference of underlying physical states, thereby yielding prediction candidates ranked by the SIS importance weights, i.e., their confidences. Our experimental results demonstrate that BP-Net substantially outperforms existing approaches at predicting future frames from noisy data.
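To make the ranking idea concrete, the sketch below illustrates generic sequential importance sampling: latent-state particles are propagated through a dynamics model, re-weighted by how well they explain each noisy observation, and the resulting importance weights serve as confidences for ranking candidate predictions. This is a minimal, self-contained toy (random-walk dynamics, Gaussian observation noise), not the paper's differentiable, learned BP-Net; all function names and parameters here are illustrative assumptions.

```python
import numpy as np

def sequential_importance_sampling(observations, transition, likelihood,
                                   n_particles=32, rng=None):
    """Minimal SIS sketch (illustrative, not BP-Net): propagate particles
    through a transition model and accumulate log importance weights from
    the observation likelihood. The normalized weights can be read as
    confidences for ranking prediction candidates."""
    rng = np.random.default_rng() if rng is None else rng
    state_dim = observations.shape[1]
    # Start particles from a broad prior over the latent state.
    particles = rng.normal(0.0, 1.0, size=(n_particles, state_dim))
    log_weights = np.zeros(n_particles)

    for obs in observations:
        # Propagate each particle through the stochastic dynamics model.
        particles = transition(particles, rng)
        # Re-weight particles by the likelihood of the noisy observation.
        log_weights += likelihood(obs, particles)
        # Normalize in log space for numerical stability.
        log_weights -= np.logaddexp.reduce(log_weights)

    return particles, np.exp(log_weights)


# Toy models (assumed for this example): random-walk latent dynamics
# observed under isotropic Gaussian noise.
def transition(particles, rng):
    return particles + rng.normal(0.0, 0.1, size=particles.shape)

def likelihood(obs, particles, obs_noise=0.5):
    sq_err = np.sum((particles - obs) ** 2, axis=1)
    return -0.5 * sq_err / obs_noise ** 2

rng = np.random.default_rng(0)
# Stand-in for a low-dimensional encoding of ten noisy frames.
noisy_frames = rng.normal(0.0, 0.5, size=(10, 4))
particles, weights = sequential_importance_sampling(
    noisy_frames, transition, likelihood, rng=rng)

# Rank candidate latent states by importance weight (the "confidence").
order = np.argsort(-weights)
print("top-3 candidates:\n", particles[order[:3]])
print("their confidences:", weights[order[:3]])
```

In the paper's setting, the transition and likelihood would be learned neural modules and the sampling step made differentiable so the whole pipeline trains end to end; the sketch only shows why sorting candidates by importance weight gives a confidence-ordered set of predictions.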