Unsupervised 3D Animal Canonical Pose Estimation with Geometric Self-Supervision

Xiaowei Dai, Shuiwang Li, Qijun Zhao, Hongyu Yang
Published in: 2023 IEEE 17th International Conference on Automatic Face and Gesture Recognition (FG), 2023-01-05
DOI: 10.1109/FG57933.2023.10042785 (https://doi.org/10.1109/FG57933.2023.10042785)
Citations: 1

Abstract

Although analyzing animal shape and pose has potential applications in many fields, there is little work on 3D animal pose estimation. This can be attributed to two factors: the lack of large-scale, well-annotated datasets, and perspective ambiguities that make it difficult to map 2D space to 3D space. To address data scarcity, we propose an unsupervised method that estimates 3D animal pose given only 2D poses. To deal with perspective ambiguities, we introduce a canonical consistency loss and a camera consistency loss to impose geometric priors during training, and combine a reprojection loss with a 2D pose discriminator to enable self-supervised learning. Specifically, given a 2D pose, the pose generator network generates a corresponding 3D pose and the camera network estimates a camera rotation. During training, the generated 3D pose is reprojected onto random camera viewpoints to synthesize a new 2D pose. The synthesized 2D pose is then decomposed again into a 3D pose and a camera rotation, and consistency losses are imposed on both the 3D canonical poses and the camera rotations for self-supervised training. We evaluate the proposed method on real and synthetic datasets, i.e., SMAL and AcinoSet. The experimental results demonstrate the effectiveness of the proposed method, which achieves state-of-the-art performance among unsupervised algorithms for 3D animal canonical pose estimation.
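The training cycle described in the abstract (lift a 2D pose, reproject it to a random viewpoint, lift again, compare) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the orthographic projection, the restriction of random viewpoints to rotations about the vertical axis, the mean-squared-error form of each loss, and the names `rotation_about_y`, `project`, `cycle_losses`, and `lift` are all hypothetical. The 2D pose discriminator is omitted.

```python
import numpy as np

def rotation_about_y(angle):
    """3x3 rotation by `angle` radians about the vertical (y) axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

def project(pose_3d, rotation):
    """Rotate a (J, 3) canonical pose into the camera frame and drop depth.

    Orthographic projection is an assumption made for this sketch.
    """
    return (pose_3d @ rotation.T)[:, :2]

def cycle_losses(pose_2d, lift, rng):
    """Return the three geometric loss terms for one (J, 2) input pose.

    `lift` is a hypothetical callable standing in for the pose generator and
    the camera network together: it maps a (J, 2) pose to a (J, 3) canonical
    pose and a 3x3 camera rotation.
    """
    # Step 1: decompose the input 2D pose into canonical 3D pose and camera.
    canon, rot = lift(pose_2d)
    reprojection = np.mean((project(canon, rot) - pose_2d) ** 2)

    # Step 2: synthesize a new 2D pose from a randomly sampled viewpoint.
    random_rot = rotation_about_y(rng.uniform(-np.pi, np.pi))
    synth_2d = project(canon, random_rot)

    # Step 3: decompose the synthesized pose and compare with what produced it.
    canon_again, rot_again = lift(synth_2d)
    canonical = np.mean((canon_again - canon) ** 2)   # canonical consistency
    camera = np.mean((rot_again - random_rot) ** 2)   # camera consistency

    return {"reprojection": reprojection,
            "canonical": canonical,
            "camera": camera}
```

In a real training setup `lift` would be a pair of neural networks and the three terms would be weighted and summed with the discriminator loss; the weights and network architectures are not given in the abstract, so they are left out here.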