Zeyuan Zhang, Guiyu Xia, Paike Yang, Wenkai Ye, Yubao Sun, Jia Liu
Proceedings of the 2023 3rd International Conference on Robotics and Control Engineering, May 12, 2023. DOI: 10.1145/3598151.3598177
Pose driven motion image generation aided by depth information
Motion transfer, which can serve as a driving technology for interaction between users and virtual characters, has been a research hotspot in recent years. It is essentially a deformation process of the human appearance, so motion transfer is typically regarded as a pose-guided image generation task that can be solved within a GAN-based framework. However, real motions occur in 3D space, and image generation in the 2D plane inevitably lacks the depth guidance of the original motions, which leads to depth confusion between different body parts. In addition, the adversarial loss of a GAN imposes only weak constraints on silhouette details. In this paper, we propose a two-stage GAN-based model to compensate for the missing depth information and to improve the accuracy of the generated silhouette details. In stage I, we propose a silhouette-attention GAN with a silhouette consistency loss to generate the depth maps of target people. This not only introduces the depth information of the original motions in 3D space but also aligns the body regions with reliable silhouettes for the subsequent person image generation. In stage II, we propose a context-enhanced GAN that takes the target poses and the depth maps generated in the first stage as input and produces the final motion images. The generated results have reliable depth information and accurate silhouettes, demonstrating the effectiveness of the proposed model.
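The two-stage flow described above can be sketched as follows. This is a minimal illustrative skeleton, not the authors' implementation: the network bodies, layer sizes, channel counts, and class names are assumptions; only the data flow (pose → depth map in stage I, pose + depth map → motion image in stage II) follows the abstract.

```python
# Hypothetical sketch of the two-stage pipeline from the abstract.
# Architectures are placeholders; only the stage-I/stage-II data flow
# (pose -> depth, then pose + depth -> image) mirrors the paper's description.
import torch
import torch.nn as nn

class SilhouetteAttentionGAN(nn.Module):
    """Stage I (sketch): maps target pose heatmaps to a single-channel depth map."""
    def __init__(self, pose_channels: int = 18):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(pose_channels, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, kernel_size=3, padding=1), nn.Sigmoid(),  # depth in [0, 1]
        )

    def forward(self, pose: torch.Tensor) -> torch.Tensor:
        return self.net(pose)

class ContextEnhancedGAN(nn.Module):
    """Stage II (sketch): concatenates pose and depth, outputs an RGB motion image."""
    def __init__(self, pose_channels: int = 18):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(pose_channels + 1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, kernel_size=3, padding=1), nn.Tanh(),  # image in [-1, 1]
        )

    def forward(self, pose: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([pose, depth], dim=1))

# Example run: 18 keypoint heatmaps (a common pose representation) at 64x64.
pose = torch.randn(1, 18, 64, 64)
stage1 = SilhouetteAttentionGAN()
stage2 = ContextEnhancedGAN()
depth = stage1(pose)            # stage I: depth map carrying silhouette cues
image = stage2(pose, depth)     # stage II: final motion image
print(depth.shape, image.shape)
```

In training, stage I would additionally be supervised by the silhouette consistency loss mentioned in the abstract, and both stages by adversarial losses; those objectives are omitted here for brevity.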