Generation of Complex 3D Human Motion by Temporal and Spatial Composition of Diffusion Models

Lorenzo Mandelli, Stefano Berretti
arXiv - CS - Computer Vision and Pattern Recognition, 2024-09-18. DOI: arxiv-2409.11920 (https://doi.org/arxiv-2409.11920)

Abstract

This paper addresses the challenge of generating realistic 3D human motions for action classes never seen during training. Our approach decomposes a complex action into simpler movements, specifically those observed during training, by leveraging the knowledge of human motion contained in GPT models. These simpler movements are then combined into a single, realistic animation using the properties of diffusion models. We claim that this decomposition and subsequent recombination of simple movements can synthesize an animation that accurately represents the complex input action. The method operates at inference time and can be integrated with any pre-trained diffusion model, enabling the synthesis of motion classes not present in the training data. We evaluate the method by dividing two benchmark human motion datasets into basic and complex actions and comparing its performance against the state of the art.
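The abstract does not spell out how the simple movements are recombined, so the following is only an illustrative sketch, not the authors' method. It assumes that, at a single sampling step, a pre-trained diffusion model has produced one prediction per simple movement, and that each movement comes with a temporal weight (when it is active) and a spatial weight (which joints it drives). The function name `compose_predictions`, the mask layout, and the `fallback` prediction are all hypothetical.

```python
import numpy as np

def compose_predictions(preds, time_masks, joint_masks, fallback):
    """Blend K per-movement predictions of shape (T, J, D) into one (T, J, D) array.

    preds:       (K, T, J, D) one model output per simple movement (assumed given).
    time_masks:  (K, T) weights saying when each movement is active.
    joint_masks: (K, J) weights saying which joints each movement drives.
    fallback:    (T, J, D) prediction used wherever no movement is active.
    """
    preds = np.asarray(preds, dtype=float)
    time_masks = np.asarray(time_masks, dtype=float)
    joint_masks = np.asarray(joint_masks, dtype=float)

    # Per-movement weight for every (frame, joint) pair: shape (K, T, J).
    w = time_masks[:, :, None] * joint_masks[:, None, :]

    total = w.sum(axis=0)                         # (T, J) total weight
    blended = (w[..., None] * preds).sum(axis=0)  # (T, J, D) weighted sum

    # Normalize where at least one movement is active; avoid division by zero.
    covered = total > 0
    blended = blended / np.where(covered, total, 1.0)[..., None]

    # Uncovered (frame, joint) pairs take the fallback prediction.
    return np.where(covered[..., None], blended, fallback)


# Toy usage: 2 movements, 4 frames, 2 joints, 1 feature per joint.
preds = np.stack([np.full((4, 2, 1), 1.0), np.full((4, 2, 1), 3.0)])
time_masks = [[1, 1, 1, 0], [0, 0, 1, 0]]   # movements overlap on frame 2
joint_masks = [[1, 1], [0, 1]]              # second movement drives joint 1 only
out = compose_predictions(preds, time_masks, joint_masks, np.zeros((4, 2, 1)))
```

In this toy case the overlapping frame and joint receives the average of the two predictions, and the final frame, which no movement covers, takes the fallback. A real sampler would apply such a blend inside every denoising step; the weighting scheme actually used in the paper may differ.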