Filter-deform attention GAN: constructing human motion videos from few images

Jianjun Zhu, Huihuang Zhao, Yudong Zhang
{"title":"滤波变形注意力 GAN:从少量图像构建人体运动视频","authors":"Jianjun Zhu, Huihuang Zhao, Yudong Zhang","doi":"10.1007/s00371-024-03595-w","DOIUrl":null,"url":null,"abstract":"<p>Human motion transfer is challenging due to the complexity and diversity of human motion and clothing textures. Existing methods use 2D pose estimation to obtain poses, which can easily lead to unsmooth motion and artifacts. Therefore, this paper proposes a highly robust motion transmission model based on image deformation, called the Filter-Deform Attention Generative Adversarial Network (FDA GAN). This method can transmit complex human motion videos using only few human images. First, we use a 3D pose shape estimator instead of traditional 2D pose estimation to address the problem of unsmooth motion. Then, to tackle the artifact problem, we design a new attention mechanism and integrate it with the GAN, proposing a new network capable of effectively extracting image features and generating human motion videos. Finally, to further transfer the style of the source human, we propose a two-stream style loss, which enhances the model’s learning ability. Experimental results demonstrate that the proposed method outperforms recent methods in overall performance and various evaluation metrics. Project page: https://github.com/mioyeah/FDA-GAN.</p>","PeriodicalId":501186,"journal":{"name":"The Visual Computer","volume":"44 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Filter-deform attention GAN: constructing human motion videos from few images\",\"authors\":\"Jianjun Zhu, Huihuang Zhao, Yudong Zhang\",\"doi\":\"10.1007/s00371-024-03595-w\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Human motion transfer is challenging due to the complexity and diversity of human motion and clothing textures. Existing methods use 2D pose estimation to obtain poses, which can easily lead to unsmooth motion and artifacts. Therefore, this paper proposes a highly robust motion transmission model based on image deformation, called the Filter-Deform Attention Generative Adversarial Network (FDA GAN). This method can transmit complex human motion videos using only few human images. First, we use a 3D pose shape estimator instead of traditional 2D pose estimation to address the problem of unsmooth motion. Then, to tackle the artifact problem, we design a new attention mechanism and integrate it with the GAN, proposing a new network capable of effectively extracting image features and generating human motion videos. Finally, to further transfer the style of the source human, we propose a two-stream style loss, which enhances the model’s learning ability. Experimental results demonstrate that the proposed method outperforms recent methods in overall performance and various evaluation metrics. 
Project page: https://github.com/mioyeah/FDA-GAN.</p>\",\"PeriodicalId\":501186,\"journal\":{\"name\":\"The Visual Computer\",\"volume\":\"44 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-08-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"The Visual Computer\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1007/s00371-024-03595-w\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"The Visual Computer","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s00371-024-03595-w","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Human motion transfer is challenging due to the complexity and diversity of human motion and clothing textures. Existing methods obtain poses via 2D pose estimation, which easily leads to unsmooth motion and artifacts. This paper therefore proposes a highly robust motion transfer model based on image deformation, called the Filter-Deform Attention Generative Adversarial Network (FDA GAN), which can transfer complex human motion to video using only a few human images. First, we use a 3D pose and shape estimator instead of traditional 2D pose estimation to address unsmooth motion. Then, to tackle artifacts, we design a new attention mechanism and integrate it with the GAN, yielding a network that can effectively extract image features and generate human motion videos. Finally, to further transfer the style of the source person, we propose a two-stream style loss, which enhances the model's learning ability. Experimental results demonstrate that the proposed method outperforms recent methods in overall performance and across various evaluation metrics. Project page: https://github.com/mioyeah/FDA-GAN.
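The abstract names a filter-deform attention mechanism but does not specify its design. The sketch below is only one plausible, hypothetical reading in PyTorch: a convolutional "filter" branch predicts per-pixel 2D offsets, the source features are deformed with those offsets via grid sampling, and a channel-attention gate fuses the deformed features into the target features. The module name, branch layout, and hyperparameters are all assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FilterDeformAttention(nn.Module):
    """Hypothetical sketch of a 'filter-deform' attention block.

    NOT the paper's implementation: a conv 'filter' branch predicts
    per-pixel (dx, dy) offsets, the source feature map is deformed with
    grid_sample, and a squeeze-and-excitation style gate weights the
    deformed features before adding them to the target features.
    """

    def __init__(self, channels: int):
        super().__init__()
        # Filter branch: predict 2D offsets from concatenated features.
        self.offset = nn.Conv2d(2 * channels, 2, kernel_size=3, padding=1)
        # Channel-attention gate over the deformed source features.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // 4, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, src_feat: torch.Tensor, tgt_feat: torch.Tensor) -> torch.Tensor:
        b, _, h, w = src_feat.shape
        offsets = self.offset(torch.cat([src_feat, tgt_feat], dim=1))  # (B, 2, H, W)
        # Identity sampling grid in [-1, 1], shifted by the predicted offsets.
        ys, xs = torch.meshgrid(
            torch.linspace(-1, 1, h, device=src_feat.device),
            torch.linspace(-1, 1, w, device=src_feat.device),
            indexing="ij",
        )
        grid = torch.stack([xs, ys], dim=-1).unsqueeze(0).expand(b, -1, -1, -1)
        grid = grid + offsets.permute(0, 2, 3, 1)
        deformed = F.grid_sample(src_feat, grid, align_corners=True)
        return tgt_feat + self.gate(deformed) * deformed
```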
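The two-stream style loss is likewise only named in the abstract. A minimal sketch follows, assuming the common Gram-matrix formulation over frozen VGG-19 features, with one stream matching the source appearance image and a second stream matching the previous generated frame for temporal consistency. The tapped layer indices, the weights w_src and w_tmp, and the two-stream interpretation itself are assumptions; the snippet also assumes a recent torchvision with the Weights enum API.

```python
import torch
import torch.nn as nn
import torchvision.models as models

def gram_matrix(feat: torch.Tensor) -> torch.Tensor:
    # (B, C, H, W) -> normalized Gram matrix (B, C, C).
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    return torch.bmm(f, f.transpose(1, 2)) / (c * h * w)

class TwoStreamStyleLoss(nn.Module):
    """Hypothetical two-stream style loss (not the paper's code).

    Stream 1 matches style statistics of the generated frame to the
    source appearance image; stream 2 matches them to the previous
    generated frame, encouraging temporally stable textures.
    """

    def __init__(self, layers=(3, 8, 17, 26)):
        super().__init__()
        vgg = models.vgg19(weights=models.VGG19_Weights.DEFAULT).features.eval()
        for p in vgg.parameters():
            p.requires_grad_(False)
        self.vgg = vgg
        self.layers = set(layers)  # ReLU indices to tap; an assumption.

    def extract(self, x: torch.Tensor):
        feats = []
        for i, layer in enumerate(self.vgg):
            x = layer(x)
            if i in self.layers:
                feats.append(x)
        return feats

    def forward(self, generated, source, prev_frame, w_src=1.0, w_tmp=0.5):
        # Inputs are assumed to be ImageNet-normalized RGB batches.
        loss = generated.new_zeros(())
        for g, s, p in zip(self.extract(generated), self.extract(source),
                           self.extract(prev_frame)):
            loss = loss + w_src * torch.mean((gram_matrix(g) - gram_matrix(s)) ** 2)
            loss = loss + w_tmp * torch.mean((gram_matrix(g) - gram_matrix(p)) ** 2)
        return loss
```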
