{"title":"一致的长序列深面","authors":"Xudong Fan, Daniele Bonatto, G. Lafruit","doi":"10.1109/IC3D48390.2019.8975999","DOIUrl":null,"url":null,"abstract":"Face swapping in videos usually has strong entertainment applications. Deep Fakes (in Faces) are a recent topic in deep learning where the main idea is to substitute the face of a person in a video with the face of another person. But one of the drawbacks of the method is that between two successive frames there are inconsistencies between the faces, such as changing face color, flickering or eyebrows that change. In this paper, we propose a convolutional neural network for swapping faces based on two autoencoders which share the same encoder. In this network, the encoder can distinguish and extract important features of faces, including facial expressions and poses; the decoders will then reconstruct faces according to these features. First, we will generate datasets of faces respectively for person A and person B. Secondly, the local information of two faces is sent to the network to get the model; after the training process, we can use the model to reconstruct the corresponding face of person B when the input is one face of person A. Afterwards, we build a binary mask to select the face area and transfer color from the source face to the target face. Finally, we only need to use a seamless clone to merge the new faces back into the source frames to create a fake video. The experimental results show that the quality of the fake videos is improved significantly.","PeriodicalId":344706,"journal":{"name":"2019 International Conference on 3D Immersion (IC3D)","volume":"130 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Consistent Long Sequences Deep Faces\",\"authors\":\"Xudong Fan, Daniele Bonatto, G. Lafruit\",\"doi\":\"10.1109/IC3D48390.2019.8975999\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Face swapping in videos usually has strong entertainment applications. Deep Fakes (in Faces) are a recent topic in deep learning where the main idea is to substitute the face of a person in a video with the face of another person. But one of the drawbacks of the method is that between two successive frames there are inconsistencies between the faces, such as changing face color, flickering or eyebrows that change. In this paper, we propose a convolutional neural network for swapping faces based on two autoencoders which share the same encoder. In this network, the encoder can distinguish and extract important features of faces, including facial expressions and poses; the decoders will then reconstruct faces according to these features. First, we will generate datasets of faces respectively for person A and person B. Secondly, the local information of two faces is sent to the network to get the model; after the training process, we can use the model to reconstruct the corresponding face of person B when the input is one face of person A. Afterwards, we build a binary mask to select the face area and transfer color from the source face to the target face. Finally, we only need to use a seamless clone to merge the new faces back into the source frames to create a fake video. 
The experimental results show that the quality of the fake videos is improved significantly.\",\"PeriodicalId\":344706,\"journal\":{\"name\":\"2019 International Conference on 3D Immersion (IC3D)\",\"volume\":\"130 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 International Conference on 3D Immersion (IC3D)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IC3D48390.2019.8975999\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 International Conference on 3D Immersion (IC3D)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IC3D48390.2019.8975999","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Face swapping in videos has strong entertainment applications. Deep Fakes (for faces) are a recent topic in deep learning whose main idea is to substitute the face of one person in a video with the face of another person. One drawback of the method is that inconsistencies appear between successive frames, such as shifting face color, flickering, or changing eyebrows. In this paper, we propose a convolutional neural network for swapping faces based on two autoencoders that share the same encoder. In this network, the encoder distinguishes and extracts the important features of a face, including facial expression and pose; the decoders then reconstruct faces according to these features. First, we generate face datasets for person A and for person B. Second, the local information of the two faces is fed to the network to train the model; after training, the model can reconstruct the corresponding face of person B when the input is a face of person A. Afterwards, we build a binary mask to select the face area and transfer color from the source face to the target face. Finally, we only need seamless cloning to merge the new faces back into the source frames and create a fake video. The experimental results show that the quality of the fake videos is improved significantly.
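The shared-encoder, dual-decoder design described in the abstract can be sketched as follows. This is a minimal illustration in PyTorch (the abstract does not name a framework; the layer sizes, latent dimension, loss, and learning rate are assumptions, not the authors' exact configuration): each identity gets its own decoder, both are trained through the common encoder, and a swap is produced by decoding A's latent code with B's decoder.

```python
# Minimal sketch of a shared-encoder / dual-decoder face-swap autoencoder.
# Layer sizes and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, latent_dim=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 5, stride=2, padding=2), nn.LeakyReLU(0.1),
            nn.Conv2d(64, 128, 5, stride=2, padding=2), nn.LeakyReLU(0.1),
            nn.Conv2d(128, 256, 5, stride=2, padding=2), nn.LeakyReLU(0.1),
            nn.Flatten(),
            nn.Linear(256 * 8 * 8, latent_dim),   # assumes 64x64 face crops
        )

    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    def __init__(self, latent_dim=512):
        super().__init__()
        self.fc = nn.Linear(latent_dim, 256 * 8 * 8)
        self.net = nn.Sequential(
            nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.1),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.1),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, z):
        h = self.fc(z).view(-1, 256, 8, 8)
        return self.net(h)

encoder = Encoder()
decoder_a, decoder_b = Decoder(), Decoder()   # one decoder per identity

loss_fn = nn.L1Loss()
opt = torch.optim.Adam(
    list(encoder.parameters()) + list(decoder_a.parameters()) + list(decoder_b.parameters()),
    lr=5e-5,
)

def train_step(batch_a, batch_b):
    # Each decoder reconstructs its own identity through the shared encoder.
    recon_a = decoder_a(encoder(batch_a))
    recon_b = decoder_b(encoder(batch_b))
    loss = loss_fn(recon_a, batch_a) + loss_fn(recon_b, batch_b)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# At inference time, swap by decoding A's features with B's decoder:
# fake_b = decoder_b(encoder(face_of_a))
```

Because the encoder is shared, it is pushed to encode identity-independent attributes (expression, pose, lighting), while each decoder re-renders those attributes with its own identity's appearance.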
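The post-processing stage (binary mask, color transfer, seamless clone) could look roughly like the sketch below, using OpenCV. The convex-hull mask from facial landmarks and the mean/std color matching in LAB space are illustrative assumptions rather than the paper's exact recipe, and the sketch assumes the swapped face has already been warped back into the coordinates of the source frame.

```python
# Minimal sketch of mask building, color transfer, and seamless cloning.
# Helper names and the landmark-based mask are illustrative assumptions.
import cv2
import numpy as np

def convex_hull_mask(landmarks, shape):
    """Binary mask covering the face region given (x, y) landmark points."""
    mask = np.zeros(shape[:2], dtype=np.uint8)
    hull = cv2.convexHull(np.array(landmarks, dtype=np.int32))
    cv2.fillConvexPoly(mask, hull, 255)
    return mask

def transfer_color(source, target, mask):
    """Match mean/std of the swapped face to the source face in LAB space."""
    src_lab = cv2.cvtColor(source, cv2.COLOR_BGR2LAB).astype(np.float32)
    tgt_lab = cv2.cvtColor(target, cv2.COLOR_BGR2LAB).astype(np.float32)
    m = mask > 0
    for c in range(3):
        s_mean, s_std = src_lab[..., c][m].mean(), src_lab[..., c][m].std() + 1e-6
        t_mean, t_std = tgt_lab[..., c][m].mean(), tgt_lab[..., c][m].std() + 1e-6
        tgt_lab[..., c] = (tgt_lab[..., c] - t_mean) * (s_std / t_std) + s_mean
    return cv2.cvtColor(np.clip(tgt_lab, 0, 255).astype(np.uint8), cv2.COLOR_LAB2BGR)

def blend_face(frame, swapped_face, landmarks):
    """Paste the color-corrected swapped face back into the original frame."""
    mask = convex_hull_mask(landmarks, frame.shape)
    corrected = transfer_color(frame, swapped_face, mask)
    x, y, w, h = cv2.boundingRect(np.array(landmarks, dtype=np.int32))
    center = (x + w // 2, y + h // 2)
    return cv2.seamlessClone(corrected, frame, mask, center, cv2.NORMAL_CLONE)
```

Restricting the blend to the masked face region and matching color statistics before cloning is what keeps the per-frame output consistent with the surrounding frame, which is the kind of temporal artifact (color shifts, flicker) the abstract targets.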