SkeletonGAN: Fine-Grained Pose Synthesis of Human-Object Interactions

Qixuan Sun, Nanxi Chen, Ruipeng Zhang, Jiamao Li, Xiaolin Zhang
Published in: Proceedings of the 2023 6th International Conference on Machine Vision and Applications
Publication date: 2023-03-10
DOI: 10.1145/3589572.3589579 (https://doi.org/10.1145/3589572.3589579)

Abstract

Synthesizing Human-Object Interactions (HOI) is a challenging problem because the human body has a complex and versatile representation. Existing solutions generate individual objects or faces very well but still struggle to generate realistic human bodies and their interactions with multiple objects. In this work, we focus on synthesizing human poses from HOI descriptive triplets and introduce a novel perspective that decomposes every action between humans and objects into sub-actions of human body parts, so that body poses are generated in a fine-grained way. We propose SkeletonGAN, a conditional generative adversarial model that provides body-part-level control over the interaction between humans and objects. SkeletonGAN is trained and evaluated on the HICO-DET dataset, a knowledge base of complex interaction poses covering various human-object actions in realistic scenarios. Qualitative and quantitative evaluations show that the model generates diverse and plausible poses consistent with the given semantic features. In particular, the model can also predict the position of the object relative to the body pose. We also explore synthesizing composite poses that include co-occurring human actions, which indicates that the model can learn multimodal relationships between human poses and the given conditional semantic features.
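To make the decomposition idea concrete, the following is a minimal sketch of how an HOI triplet (human, verb, object) might be turned into a body-part-level conditioning vector for a conditional generator. It is not the paper's implementation: the body-part list, the sub-action vocabulary, the `DECOMPOSITION` table, and the functions `encode_condition` and `generator_input` are all hypothetical names chosen for illustration, and the abstract does not specify how SkeletonGAN actually encodes its conditions.

```python
import random

# Hypothetical vocabularies; the abstract does not list the actual
# body parts or sub-actions used by SkeletonGAN.
BODY_PARTS = ["head", "arms", "hands", "hips", "legs", "feet"]
SUB_ACTIONS = ["idle", "hold", "look_at", "sit_on", "straddle", "step_on"]

# Hypothetical table mapping a (verb, object) pair to per-part sub-actions.
DECOMPOSITION = {
    ("ride", "bicycle"): {
        "head": "look_at",
        "hands": "hold",
        "hips": "sit_on",
        "feet": "step_on",
    },
    ("hold", "cup"): {"head": "look_at", "hands": "hold"},
}


def one_hot(index, size):
    """Return a list of length `size` with 1.0 at `index` and 0.0 elsewhere."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def encode_condition(verb, obj):
    """Concatenate one one-hot sub-action vector per body part.

    Parts not mentioned for the given (verb, object) pair default to "idle".
    """
    part_actions = DECOMPOSITION.get((verb, obj), {})
    cond = []
    for part in BODY_PARTS:
        action = part_actions.get(part, "idle")
        cond.extend(one_hot(SUB_ACTIONS.index(action), len(SUB_ACTIONS)))
    return cond


def generator_input(verb, obj, noise_dim=16, seed=None):
    """Build a conditional-GAN style input: Gaussian noise followed by the condition."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return noise + encode_condition(verb, obj)


if __name__ == "__main__":
    z = generator_input("ride", "bicycle", noise_dim=16, seed=0)
    # 16 noise values plus 6 parts * 6 sub-actions of condition
    print(len(z))
```

In a sketch like this, the per-part encoding is what would allow part-level control: changing only the sub-action assigned to one body part changes only that part's slice of the conditioning vector, while the noise portion accounts for the diversity of poses generated under the same condition.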