具有全局时间一致性的视频到视频翻译

Xingxing Wei, Jun Zhu, Sitong Feng, Hang Su
{"title":"具有全局时间一致性的视频到视频翻译","authors":"Xingxing Wei, Jun Zhu, Sitong Feng, Hang Su","doi":"10.1145/3240508.3240708","DOIUrl":null,"url":null,"abstract":"Although image-to-image translation has been widely studied, the video-to-video translation is rarely mentioned. In this paper, we propose an unified video-to-video translation framework to accom- plish different tasks, like video super-resolution, video colouriza- tion, and video segmentation, etc. A consequent question within video-to-video translation lies in the flickering appearance along with the varying frames. To overcome this issue, a usual method is to incorporate the temporal loss between adjacent frames in the optimization, which is a kind of local frame-wise temporal con- sistency. We instead present a residual error based mechanism to ensure the video-level consistency of the same location in different frames (called (lobal temporal consistency). The global and local consistency are simultaneously integrated into our video-to-video framework to achieve more stable videos. Our method is based on the GAN framework, where we present a two-channel discrimina- tor. One channel is to encode the video RGB space, and another is to encode the residual error of the video as a whole to meet the global consistency. Extensive experiments conducted on different video- to-video translation tasks verify the effectiveness and flexibleness of the proposed method.","PeriodicalId":339857,"journal":{"name":"Proceedings of the 26th ACM international conference on Multimedia","volume":"6 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":"{\"title\":\"Video-to-Video Translation with Global Temporal Consistency\",\"authors\":\"Xingxing Wei, Jun Zhu, Sitong Feng, Hang Su\",\"doi\":\"10.1145/3240508.3240708\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Although image-to-image translation has been widely studied, the video-to-video translation is rarely mentioned. In this paper, we propose an unified video-to-video translation framework to accom- plish different tasks, like video super-resolution, video colouriza- tion, and video segmentation, etc. A consequent question within video-to-video translation lies in the flickering appearance along with the varying frames. To overcome this issue, a usual method is to incorporate the temporal loss between adjacent frames in the optimization, which is a kind of local frame-wise temporal con- sistency. We instead present a residual error based mechanism to ensure the video-level consistency of the same location in different frames (called (lobal temporal consistency). The global and local consistency are simultaneously integrated into our video-to-video framework to achieve more stable videos. Our method is based on the GAN framework, where we present a two-channel discrimina- tor. One channel is to encode the video RGB space, and another is to encode the residual error of the video as a whole to meet the global consistency. Extensive experiments conducted on different video- to-video translation tasks verify the effectiveness and flexibleness of the proposed method.\",\"PeriodicalId\":339857,\"journal\":{\"name\":\"Proceedings of the 26th ACM international conference on Multimedia\",\"volume\":\"6 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-10-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"13\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 26th ACM international conference on Multimedia\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3240508.3240708\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 26th ACM international conference on Multimedia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3240508.3240708","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 13

摘要

虽然图像到图像的翻译已经被广泛研究,但视频到视频的翻译却很少被提及。在本文中,我们提出了一个统一的视频到视频翻译框架来完成不同的任务,如视频超分辨率、视频彩色化和视频分割等。在视频到视频的翻译中,随之而来的问题是随着帧的变化而出现的闪烁现象。为了克服这一问题,通常的方法是在优化中考虑相邻帧之间的时间损失,这是一种局部逐帧时间一致性。我们提出了一种基于残差的机制来确保同一位置在不同帧中的视频级一致性(称为全局时间一致性)。同时将全局一致性和局部一致性整合到我们的视频到视频框架中,以实现更稳定的视频。我们的方法是基于GAN框架,其中我们提出了一个双通道鉴别器。一个通道是对视频RGB空间进行编码,另一个通道是将视频的残差作为一个整体进行编码,以满足全局一致性。在不同的视频到视频翻译任务上进行的大量实验验证了该方法的有效性和灵活性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Video-to-Video Translation with Global Temporal Consistency
Although image-to-image translation has been widely studied, the video-to-video translation is rarely mentioned. In this paper, we propose an unified video-to-video translation framework to accom- plish different tasks, like video super-resolution, video colouriza- tion, and video segmentation, etc. A consequent question within video-to-video translation lies in the flickering appearance along with the varying frames. To overcome this issue, a usual method is to incorporate the temporal loss between adjacent frames in the optimization, which is a kind of local frame-wise temporal con- sistency. We instead present a residual error based mechanism to ensure the video-level consistency of the same location in different frames (called (lobal temporal consistency). The global and local consistency are simultaneously integrated into our video-to-video framework to achieve more stable videos. Our method is based on the GAN framework, where we present a two-channel discrimina- tor. One channel is to encode the video RGB space, and another is to encode the residual error of the video as a whole to meet the global consistency. Extensive experiments conducted on different video- to-video translation tasks verify the effectiveness and flexibleness of the proposed method.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信