Hierarchical Kinematic Probability Distributions for 3D Human Shape and Pose Estimation from Images in the Wild

2021 IEEE/CVF International Conference on Computer Vision (ICCV) Pub Date : 2021-10-01 DOI:10.1109/ICCV48922.2021.01103

A. Sengupta, Ignas Budvytis, R. Cipolla

Pages: 11199-11209

Citations: 30

Abstract

This paper addresses the problem of 3D human body shape and pose estimation from an RGB image. This is often an ill-posed problem, since multiple plausible 3D bodies may match the visual evidence present in the input, particularly when the subject is occluded. Thus, it is desirable to estimate a distribution over 3D body shape and pose conditioned on the input image instead of a single 3D reconstruction. We train a deep neural network to estimate a hierarchical matrix-Fisher distribution over relative 3D joint rotation matrices (i.e. body pose), which exploits the human body’s kinematic tree structure, as well as a Gaussian distribution over SMPL body shape parameters. To further ensure that the predicted shape and pose distributions match the visual evidence in the input image, we implement a differentiable rejection sampler to impose a reprojection loss between ground-truth 2D joint coordinates and samples from the predicted distributions, projected onto the image plane. We show that our method is competitive with the state of the art in terms of 3D shape and pose metrics on the SSP-3D and 3DPW datasets, while also yielding a structured probability distribution over 3D body shape and pose, with which we can meaningfully quantify prediction uncertainty and sample multiple plausible 3D reconstructions to explain a given input image.
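To make the per-joint building block concrete, the following is a minimal sketch of a single matrix-Fisher distribution over a 3D rotation, whose density is proportional to exp(tr(FᵀR)) for a 3x3 parameter F. It is not the authors' network or their hierarchical conditioning along the kinematic tree. The helper names (`matrix_fisher_mode`, `unnormalized_log_prob`, `rot_z`) and the example parameter F are assumptions chosen for illustration.

```python
import numpy as np

def matrix_fisher_mode(F):
    """Mode of a matrix-Fisher distribution on SO(3) with 3x3 parameter F.

    The mode maximizes tr(F^T R) over rotations R and is obtained from the
    SVD of F, with a sign correction so that the result has determinant +1.
    """
    U, _, Vt = np.linalg.svd(F)
    d = 1.0 if np.linalg.det(U @ Vt) >= 0 else -1.0
    return U @ np.diag([1.0, 1.0, d]) @ Vt

def unnormalized_log_prob(F, R):
    """log p(R | F) up to the normalizing constant, i.e. tr(F^T R)."""
    return float(np.trace(F.T @ R))

def rot_z(theta):
    """Rotation by theta radians about the z-axis (illustrative helper)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Hypothetical parameter: isotropic concentration 10 around a 30-degree
# rotation about z. In the paper's setting a network would predict F per joint.
R0 = rot_z(np.pi / 6)
F = 10.0 * R0

mode = matrix_fisher_mode(F)
lp_mode = unnormalized_log_prob(F, mode)
lp_identity = unnormalized_log_prob(F, np.eye(3))
```

With this choice of F the mode coincides with R0, and any other rotation (here the identity) receives a lower unnormalized log-probability. Larger singular values of F correspond to a more concentrated, lower-uncertainty distribution over that joint's rotation.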


