Controllable Video Generation with Sparse Trajectories

Zekun Hao, Xun Huang, Serge J. Belongie
{"title":"Controllable Video Generation with Sparse Trajectories","authors":"Zekun Hao, Xun Huang, Serge J. Belongie","doi":"10.1109/CVPR.2018.00819","DOIUrl":null,"url":null,"abstract":"Video generation and manipulation is an important yet challenging task in computer vision. Existing methods usually lack ways to explicitly control the synthesized motion. In this work, we present a conditional video generation model that allows detailed control over the motion of the generated video. Given the first frame and sparse motion trajectories specified by users, our model can synthesize a video with corresponding appearance and motion. We propose to combine the advantage of copying pixels from the given frame and hallucinating the lightness difference from scratch which help generate sharp video while keeping the model robust to occlusion and lightness change. We also propose a training paradigm that calculate trajectories from video clips, which eliminated the need of annotated training data. Experiments on several standard benchmarks demonstrate that our approach can generate realistic videos comparable to state-of-the-art video generation and video prediction methods while the motion of the generated videos can correspond well with user input.","PeriodicalId":6564,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"78","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CVPR.2018.00819","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 78

Abstract

Video generation and manipulation are important yet challenging tasks in computer vision. Existing methods usually lack ways to explicitly control the synthesized motion. In this work, we present a conditional video generation model that allows detailed control over the motion of the generated video. Given the first frame and sparse motion trajectories specified by users, our model can synthesize a video with corresponding appearance and motion. We propose to combine the advantages of copying pixels from the given frame and hallucinating the lightness difference from scratch, which helps generate sharp videos while keeping the model robust to occlusion and lightness changes. We also propose a training paradigm that calculates trajectories directly from video clips, eliminating the need for annotated training data. Experiments on several standard benchmarks demonstrate that our approach generates realistic videos comparable to state-of-the-art video generation and video prediction methods, while the motion of the generated videos corresponds well with user input.
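The abstract gives no implementation details, but the compositing idea it describes, copying pixels from the first frame and adding a hallucinated lightness difference, can be sketched in a few lines of PyTorch. Everything below (the function names, the dense-flow input, the residual branch) is a hypothetical illustration, not the paper's actual architecture:

```python
# A minimal sketch of "copy pixels + hallucinate a lightness difference".
# Assumes some upstream network has already densified the user's sparse
# trajectories into a dense flow field and a lightness residual; both of
# those inputs are hypothetical here.
import torch
import torch.nn.functional as F

def warp_first_frame(frame0, flow):
    """Backward-warp frame0 (B,3,H,W) by a dense flow field (B,2,H,W)."""
    b, _, h, w = frame0.shape
    # Base sampling grid in pixel coordinates.
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=frame0.dtype, device=frame0.device),
        torch.arange(w, dtype=frame0.dtype, device=frame0.device),
        indexing="ij",
    )
    grid_x = xs.unsqueeze(0) + flow[:, 0]  # displaced x coordinates
    grid_y = ys.unsqueeze(0) + flow[:, 1]  # displaced y coordinates
    # Normalize to [-1, 1], the coordinate range grid_sample expects.
    grid = torch.stack(
        (2.0 * grid_x / (w - 1) - 1.0, 2.0 * grid_y / (h - 1) - 1.0), dim=-1
    )
    return F.grid_sample(frame0, grid, align_corners=True)

def compose_frame(frame0, flow, lightness_residual):
    """Copy pixels by warping, then add the hallucinated lightness change."""
    return warp_first_frame(frame0, flow) + lightness_residual
```

In this reading, sharp texture comes from the copied pixels, while the residual branch is free to model brightness changes and to fill regions, such as disocclusions, that the warp cannot reach.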
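Similarly, the annotation-free training paradigm, calculating trajectories from raw video clips, could plausibly be realized with an off-the-shelf point tracker. The sketch below uses OpenCV's Lucas-Kanade tracker as one possible stand-in; the paper's exact trajectory-sampling procedure may differ:

```python
# A rough sketch of extracting sparse training trajectories from a raw clip,
# so that (first frame, trajectories) -> clip pairs need no human annotation.
import cv2
import numpy as np

def extract_sparse_trajectories(frames, max_points=16):
    """frames: list of HxW uint8 grayscale frames. Returns an (N, T, 2) array."""
    # Seed tracks at well-textured points in the first frame.
    pts = cv2.goodFeaturesToTrack(
        frames[0], maxCorners=max_points, qualityLevel=0.01, minDistance=10
    )
    if pts is None:
        return np.empty((0, len(frames), 2), dtype=np.float32)
    tracks = [pts.reshape(-1, 2)]
    prev = frames[0]
    for frame in frames[1:]:
        # status flags points that were lost; a fuller version would drop
        # those tracks instead of keeping them.
        nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev, frame, pts, None)
        tracks.append(nxt.reshape(-1, 2))
        pts, prev = nxt, frame
    # (T, N, 2) -> (N, T, 2): one (x, y) trajectory per seed point.
    return np.stack(tracks, axis=0).transpose(1, 0, 2)
```

At training time, trajectories extracted this way would serve, together with the clip's first frame, as conditioning input, with the original clip as the reconstruction target; at test time, user-drawn trajectories take their place.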