Focus and retain: Complement the Broken Pose in Human Image Synthesis

Pu Ge, Qiushi Huang, Wei Xiang, Xue Jing, Yule Li, Yiyong Li, Zhun Sun
DOI: 10.1109/WACV48630.2021.00341 (https://doi.org/10.1109/WACV48630.2021.00341)
Venue: 2021 IEEE Winter Conference on Applications of Computer Vision (WACV)
Publication date: 2021-01-01
Citations: 2

Abstract

Given a target pose, generating an image of a specific style with that pose remains an ill-posed and thus complicated problem. Most recent works treat human pose synthesis as an image spatial-transformation problem, using flow-warping techniques. However, we observe that, due to the inherent ill-posed nature of many complicated human poses, prior methods fail to generate some body parts. To tackle this problem, we propose a feature-level flow attention module and an Enhancer Network. The flow attention module produces a flow attention mask that guides the combination of the flow-warped features and the structural pose features. We then apply the Enhancer Network to refine the coarse image by injecting the pose information. We present qualitative and quantitative experimental evaluations on the DeepFashion, Market-1501, and YouTube dance datasets. Quantitatively, our method achieves an FID of 12.995 on DeepFashion, 25.459 on Market-1501, and 14.516 on the YouTube dance dataset, outperforming state-of-the-art methods including Guided-Pix2Pix, Global-Flow-Local-Attn, and CoCosNet.
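The core fusion step described in the abstract can be illustrated with a minimal sketch. This is a hypothetical NumPy mock-up, not the authors' implementation: it assumes the flow attention mask is a per-element sigmoid weight that blends the flow-warped features with the structural pose features.

```python
import numpy as np

def fuse_features(warped_feat, pose_feat, attention_logits):
    """Hypothetical sketch of attention-guided feature fusion.

    A sigmoid turns the attention logits into a mask M in (0, 1),
    which blends the two feature maps element-wise:
        fused = M * warped + (1 - M) * pose
    Where the flow warp is reliable, M is pushed toward 1 (keep the
    warped appearance); where the pose is "broken", M falls toward 0
    (fall back on the structural pose features).
    """
    mask = 1.0 / (1.0 + np.exp(-attention_logits))  # sigmoid
    return mask * warped_feat + (1.0 - mask) * pose_feat

# Toy feature maps of shape (channels, height, width).
warped = np.ones((8, 4, 4))
pose = np.zeros((8, 4, 4))
fused = fuse_features(warped, pose, np.zeros((8, 4, 4)))  # logits 0 -> M = 0.5
```

With zero logits the mask is uniformly 0.5, so the fused map is the average of the two inputs; large positive or negative logits select one branch almost exclusively.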