3D human model guided pose transfer via progressive flow prediction network
Authors: Furong Ma, Guiyu Xia, Qingshan Liu
Journal: Journal of Visual Communication and Image Representation, vol. 105, Article 104327 (impact factor 2.6, JCR Q2, Computer Science, Information Systems)
Published: 2024-10-26
DOI: 10.1016/j.jvcir.2024.104327
URL: https://www.sciencedirect.com/science/article/pii/S1047320324002839
Citations: 0
Abstract
Human pose transfer aims to transfer a conditional person image to a new target pose. The difficulty lies in modeling the large-scale spatial deformation from the conditional pose to the target pose. Commonly used 2D data representations and one-step flow prediction schemes, however, yield unreliable deformation predictions, because they lack 3D guidance and the pose change is large. To bring the original 3D motion information into human pose transfer, we propose to simulate the process by which a real person image is generated: we drive the 3D human model reconstructed from the conditional person image with the target pose and project it onto the 2D plane. The 2D projection thereby inherits the 3D information of the poses, which can guide the flow prediction. Furthermore, we propose a progressive flow prediction network consisting of two streams. One stream predicts the flow by decomposing the complex pose transformation into multiple sub-transformations. The other generates the features of the target image according to the predicted flow. In addition, to make the generated invisible regions more reliable, we feed the target pose information from the flow prediction stream, which carries structural information, into feature generation as supplementary input. The synthesized images, with accurate depth information and sharp details, demonstrate the effectiveness of the proposed method.
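The idea of splitting one large deformation into several smaller ones can be illustrated with a toy sketch. It is not the authors' network: the flows here are hand-set uniform translations rather than predicted ones, sampling is nearest-neighbour, and the names `warp` and `progressive_warp` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def warp(feat, flow):
    """Backward-warp a (H, W) map with nearest-neighbour sampling.

    flow[y, x] = (dy, dx) means out[y, x] = feat[y + dy, x + dx];
    source coordinates are clipped to the image border.
    """
    h, w = feat.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_y = np.clip(np.rint(ys + flow[..., 0]).astype(int), 0, h - 1)
    src_x = np.clip(np.rint(xs + flow[..., 1]).astype(int), 0, w - 1)
    return feat[src_y, src_x]

def progressive_warp(feat, sub_flows):
    """Apply a sequence of small sub-flows one after another."""
    for flow in sub_flows:
        feat = warp(feat, flow)
    return feat

# Toy "feature map" whose values encode their own position.
h, w = 4, 6
image = np.arange(h * w).reshape(h, w)

# Assumed large deformation: sample from 3 pixels to the right.
total_flow = np.zeros((h, w, 2))
total_flow[..., 1] = 3.0

# Decompose it into three equal sub-transformations of 1 pixel each.
sub_flows = [total_flow / 3.0] * 3

one_step = warp(image, total_flow)
progressive = progressive_warp(image, sub_flows)
```

In this toy case the three small warps reproduce the single large warp exactly. The point made in the abstract is that, when the flow has to be predicted, each sub-transformation covers a smaller displacement than the full pose change.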
About the journal:
The Journal of Visual Communication and Image Representation publishes papers on state-of-the-art visual communication and image representation, with emphasis on novel technologies and theoretical work in this multidisciplinary area of pure and applied research. The field of visual communication and image representation is considered in its broadest sense and covers both digital and analog aspects as well as processing and communication in biological visual systems.