基于流形学习的深度运动转移视频预测方法

2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) Pub Date : 2021-10-01 DOI:10.1109/ICCVW54120.2021.00470

Yuliang Cai, S. Mohan, Adithya Niranjan, Nilesh Jain, A. Cloninger, Srinjoy Das

{"title":"基于流形学习的深度运动转移视频预测方法","authors":"Yuliang Cai, S. Mohan, Adithya Niranjan, Nilesh Jain, A. Cloninger, Srinjoy Das","doi":"10.1109/ICCVW54120.2021.00470","DOIUrl":null,"url":null,"abstract":"We propose a novel manifold learning based end-to-end prediction and video synthesis framework for bandwidth reduction in motion transfer enabled applications such as video conferencing. In our workflow we use keypoint based representations of video frames where image and motion specific information are encoded in a completely unsupervised manner. Prediction of future keypoints is then performed using the manifold of a variational recurrent neural network (VRNN) following which output video frames are synthesized using an optical flow estimator and a conditional image generator in the motion transfer pipeline. The proposed architecture which combines keypoint based representation of video frames with manifold learning based prediction enables significant additional bandwidth savings over motion transfer based video conferencing systems which are implemented solely using keypoint detection. We demonstrate the superiority of our technique using two representative datasets for both video reconstruction and transfer and show that prediction using VRNN has superior performance as compared to a non manifold based technique such as RNN.","PeriodicalId":226794,"journal":{"name":"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A Manifold Learning based Video Prediction approach for Deep Motion Transfer\",\"authors\":\"Yuliang Cai, S. Mohan, Adithya Niranjan, Nilesh Jain, A. Cloninger, Srinjoy Das\",\"doi\":\"10.1109/ICCVW54120.2021.00470\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We propose a novel manifold learning based end-to-end prediction and video synthesis framework for bandwidth reduction in motion transfer enabled applications such as video conferencing. In our workflow we use keypoint based representations of video frames where image and motion specific information are encoded in a completely unsupervised manner. Prediction of future keypoints is then performed using the manifold of a variational recurrent neural network (VRNN) following which output video frames are synthesized using an optical flow estimator and a conditional image generator in the motion transfer pipeline. The proposed architecture which combines keypoint based representation of video frames with manifold learning based prediction enables significant additional bandwidth savings over motion transfer based video conferencing systems which are implemented solely using keypoint detection. We demonstrate the superiority of our technique using two representative datasets for both video reconstruction and transfer and show that prediction using VRNN has superior performance as compared to a non manifold based technique such as RNN.\",\"PeriodicalId\":226794,\"journal\":{\"name\":\"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)\",\"volume\":\"34 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICCVW54120.2021.00470\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCVW54120.2021.00470","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

我们提出了一种新颖的基于流形学习的端到端预测和视频合成框架，用于视频会议等运动传输应用的带宽减少。在我们的工作流程中，我们使用基于关键点的视频帧表示，其中图像和运动特定信息以完全无监督的方式编码。然后使用变分递归神经网络(VRNN)的流形进行未来关键点的预测，然后在运动传输管道中使用光流估计器和条件图像生成器合成输出视频帧。所提出的架构将基于关键点的视频帧表示与基于流形学习的预测相结合，与仅使用关键点检测实现的基于运动传输的视频会议系统相比，可以显著节省额外的带宽。我们使用两个用于视频重建和传输的代表性数据集证明了我们技术的优越性，并表明与基于非流形的技术(如RNN)相比，使用VRNN进行预测具有优越的性能。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

A Manifold Learning based Video Prediction approach for Deep Motion Transfer

We propose a novel manifold learning based end-to-end prediction and video synthesis framework for bandwidth reduction in motion transfer enabled applications such as video conferencing. In our workflow we use keypoint based representations of video frames where image and motion specific information are encoded in a completely unsupervised manner. Prediction of future keypoints is then performed using the manifold of a variational recurrent neural network (VRNN) following which output video frames are synthesized using an optical flow estimator and a conditional image generator in the motion transfer pipeline. The proposed architecture which combines keypoint based representation of video frames with manifold learning based prediction enables significant additional bandwidth savings over motion transfer based video conferencing systems which are implemented solely using keypoint detection. We demonstrate the superiority of our technique using two representative datasets for both video reconstruction and transfer and show that prediction using VRNN has superior performance as compared to a non manifold based technique such as RNN.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)

自引率

0.00%

发文量