3D Human Pose Estimation: Using Context Information in Monocular Video

2021 International Conference on Computer Engineering and Artificial Intelligence (ICCEAI) Pub Date : 2021-08-01 DOI:10.1109/ICCEAI52939.2021.00001

Yuan-yuan Zhou, Xiaoyan Hu


Citations: 1

Abstract

We propose a context-based two-stage network structure for 3D human pose estimation. The first stage obtains the 2D human pose and 2D key-points from the video stream; this stage is crucial to the subsequent work and to the entire process. After analyzing the limitations and shortcomings of existing models, we propose a context-based human pose estimation network structure that incorporates a BiLSTM structure into the pose machine method. In our model, invisible key-points are predicted jointly from the human pose in the current frame and from context information. Quantitative and visualization experiments show that the model mitigates both invisible key-points caused by occlusion and incorrect linking of human key-points. In the second stage, the 3D human pose is obtained through sparse representation and 3D reconstruction. The experimental results show that our method achieves higher accuracy than existing human pose estimation methods for video streams and performs better under occlusion.
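The idea of recovering an invisible key-point from both earlier and later frames can be illustrated with a deliberately simple stand-in. The sketch below is not the BiLSTM pose machine described in the abstract: it replaces the learned model with plain linear interpolation between the nearest visible detections before and after an occluded frame. The function name `fill_from_context` and the example wrist track are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def fill_from_context(track: List[Optional[Point]]) -> List[Optional[Point]]:
    """Fill missing (None) 2D positions of one key-point across frames.

    Assumption: linear interpolation stands in for a learned temporal model.
    Each gap is filled from the nearest visible detection before it (past
    context) and after it (future context). Gaps at the start or end of the
    track copy the nearest visible detection.
    """
    visible = [i for i, p in enumerate(track) if p is not None]
    if not visible:
        return list(track)  # no visible frame to infer from

    filled = list(track)
    for t, p in enumerate(track):
        if p is not None:
            continue
        prev = max((i for i in visible if i < t), default=None)
        nxt = min((i for i in visible if i > t), default=None)
        if prev is None:
            filled[t] = track[nxt]      # only future context exists
        elif nxt is None:
            filled[t] = track[prev]     # only past context exists
        else:
            w = (t - prev) / (nxt - prev)
            (x0, y0), (x1, y1) = track[prev], track[nxt]
            filled[t] = (x0 + w * (x1 - x0), y0 + w * (y1 - y0))
    return filled


# Hypothetical wrist track: occluded in frames 1 and 3.
wrist = [(10.0, 20.0), None, (14.0, 28.0), None]
print(fill_from_context(wrist))
```

A learned bidirectional model would replace the fixed interpolation weights with weights inferred from the whole pose sequence, which is what allows it to handle non-linear motion and long occlusions.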

