Coherence and Identity Learning for Arbitrary-length Face Video Generation

2020 25th International Conference on Pattern Recognition (ICPR). Pub Date: 2021-01-10. DOI: 10.1109/ICPR48806.2021.9412380

Shuquan Ye, Chu Han, Jiaying Lin, Guoqiang Han, Shengfeng He

Pages: 915-922

Citations: 1

Abstract

Face synthesis is an interesting yet challenging task in computer vision, and generating a portrait video is much harder than generating a single image. In this paper, we propose a novel video generation framework for synthesizing arbitrary-length face videos without any face exemplar or landmark. To overcome the synthesis ambiguity of face video, we propose a divide-and-conquer strategy that splits the video face synthesis problem into two aspects: face identity synthesis and rearrangement. To this end, we design a cascaded network with three components: an Identity-aware GAN (IA-GAN), a Face Coherence Network, and an Interpolation Network. IA-GAN synthesizes photorealistic faces with the same identity from a set of noise inputs. The Face Coherence Network re-arranges the faces generated by IA-GAN while keeping inter-frame coherence. The Interpolation Network eliminates the discontinuity between adjacent frames and improves the smoothness of the face video. Experimental results demonstrate that the proposed network can generate face videos with high visual quality while preserving identity. Quantitative results show that our method outperforms state-of-the-art unconditional face video generative models on multiple challenging datasets.
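The three-stage cascade described in the abstract can be pictured as a simple data flow. The sketch below is only an illustration of that flow under strong assumptions, not the authors' implementation: each "face" is a toy feature vector, and every function name and parameter is hypothetical. A shared identity code plus per-face noise stands in for IA-GAN, a greedy nearest-neighbour ordering stands in for the Face Coherence Network, and linear blending stands in for the Interpolation Network.

```python
import random


def synthesize_faces(rng, num_faces, identity_dim=8, noise_dim=4):
    # Stand-in for IA-GAN (assumption): all faces share one identity code
    # and differ only in their per-face noise code.
    identity = [rng.gauss(0, 1) for _ in range(identity_dim)]
    return [identity + [rng.gauss(0, 1) for _ in range(noise_dim)]
            for _ in range(num_faces)]


def distance(a, b):
    # Euclidean distance between two toy face vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def rearrange(faces):
    # Stand-in for the Face Coherence Network (assumption): greedily pick
    # the closest remaining face so that adjacent frames stay similar.
    ordered = [faces[0]]
    remaining = list(faces[1:])
    while remaining:
        nxt = min(remaining, key=lambda f: distance(f, ordered[-1]))
        remaining.remove(nxt)
        ordered.append(nxt)
    return ordered


def interpolate(frames, steps=1):
    # Stand-in for the Interpolation Network (assumption): insert `steps`
    # linear blends between each pair of adjacent frames.
    video = []
    for a, b in zip(frames, frames[1:]):
        video.append(a)
        for k in range(1, steps + 1):
            t = k / (steps + 1)
            video.append([(1 - t) * x + t * y for x, y in zip(a, b)])
    video.append(frames[-1])
    return video


def generate_video(seed, num_faces, steps=1):
    # Cascade: identity synthesis -> rearrangement -> interpolation.
    rng = random.Random(seed)
    faces = synthesize_faces(rng, num_faces)
    return interpolate(rearrange(faces), steps)
```

In this toy setup, the video length is controlled by `num_faces` and `steps`, which loosely mirrors the arbitrary-length property: 5 faces with 2 blends per gap give 5 + 4 × 2 = 13 frames, and the identity part of every frame stays constant because each blend mixes two vectors with the same identity code.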
