AMG: Avatar Motion Guided Video Generation

Zhangsihao Yang, Mengyi Shan, Mohammad Farazi, Wenhui Zhu, Yanxi Chen, Xuanzhao Dong, Yalin Wang
arXiv - CS - Graphics. Published 2024-09-02. DOI: arxiv-2409.01502 (https://doi.org/arxiv-2409.01502)

Abstract

The task of human video generation has gained significant attention with the advancement of deep generative models. Generating realistic videos of human movement is inherently challenging because of the intricacies of human body topology and the sensitivity to visual artifacts. The extensively studied 2D media generation methods take advantage of massive human media datasets but struggle with 3D-aware control, whereas 3D avatar-based approaches, while offering more freedom in control, lack photorealism and cannot be harmonized seamlessly with the background scene. We propose AMG, a method that combines 2D photorealism with 3D controllability by conditioning video diffusion models on controlled renderings of 3D avatars. We additionally introduce a novel data processing pipeline that reconstructs and renders human avatar movements from dynamic-camera videos. AMG is the first method that enables multi-person diffusion video generation with precise control over camera positions, human motions, and background style. We also demonstrate through extensive evaluation that it outperforms existing human video generation methods conditioned on pose sequences or driving videos in terms of realism and adaptability.