Skeleton-aware Graph-based Adversarial Networks for Human Pose Estimation from Sparse IMUs

Impact factor: 6; Zone 3 (Computer Science); JCR Q1 (Computer Science, Information Systems)

ACM Transactions on Multimedia Computing Communications and Applications Pub Date : 2024-05-29 DOI:10.1145/3669904

Kaixin Chen, Lin Zhang, Zhong Wang, Shengjie Zhao, Yicong Zhou



Abstract

Recently, sparse-inertial human pose estimation (SI-HPE) with only a few IMUs has shown great potential in various fields. The most advanced work in this area has achieved reasonably good results using only six IMUs. However, two major issues remain to be addressed. First, existing methods typically treat SI-HPE as a temporal sequence learning problem and often ignore the important spatial prior of skeletal topology. Second, their training data contain far more synthetic data than real data, and the distributions of the two differ considerably, which makes it difficult for the models to generalize to more diverse real data. To address these issues, we propose “Graph-based Adversarial Inertial Poser (GAIP)”, which tracks body movements using sparse data from six IMUs. To make full use of the spatial prior, we design a multi-stage pose regressor with graph convolution to explicitly learn the skeletal topology. A joint position loss is also introduced to implicitly mine spatial information. To enhance generalization, we propose supervising the pose regression with an adversarial loss from a discriminator, which exploits the ability of adversarial networks to learn implicit constraints. Additionally, we construct a real dataset that includes hip support movements and a synthetic dataset containing various motion categories to enrich the diversity of inertial data for SI-HPE. Extensive experiments demonstrate that GAIP produces results with more precise limb movement amplitudes and relative joint positions, and with smaller joint angle and position errors, than state-of-the-art counterparts. The datasets and code are publicly available at https://cslinzhang.github.io/GAIP/.
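To illustrate what "graph convolution over the skeletal topology" can mean in practice, the following is a minimal NumPy sketch of a single generic graph-convolution layer applied to per-joint features. It is not the GAIP implementation: the five-joint toy skeleton, the feature sizes, the symmetric normalization, and the names `EDGES`, `normalized_adjacency`, and `graph_conv` are all assumptions made for illustration. The authors' actual architecture is in the code released at the URL above.

```python
import numpy as np

# Hypothetical 5-joint toy skeleton (an assumption, not the paper's body model):
# 0 = pelvis, 1 = spine, 2 = head, 3 = left hip, 4 = right hip.
EDGES = [(0, 1), (1, 2), (0, 3), (0, 4)]
NUM_JOINTS = 5


def normalized_adjacency(edges, num_joints):
    """Return D^{-1/2} (A + I) D^{-1/2} for an undirected skeleton graph."""
    a = np.eye(num_joints)  # self-loops keep each joint's own features
    for i, j in edges:
        a[i, j] = 1.0
        a[j, i] = 1.0
    deg = a.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def graph_conv(x, a_hat, w):
    """One graph-convolution layer: ReLU(A_hat @ X @ W).

    x:     (num_joints, in_dim) per-joint features
    a_hat: (num_joints, num_joints) normalized adjacency
    w:     (in_dim, out_dim) weights, shared across joints
    """
    return np.maximum(a_hat @ x @ w, 0.0)


rng = np.random.default_rng(0)
features = rng.standard_normal((NUM_JOINTS, 6))  # placeholder per-joint input
weights = rng.standard_normal((6, 8))            # placeholder layer weights
a_hat = normalized_adjacency(EDGES, NUM_JOINTS)
out = graph_conv(features, a_hat, weights)
print(out.shape)  # (5, 8)
```

Because the adjacency matrix is zero for unconnected joints, each output row mixes only a joint's own features with those of its skeletal neighbours, which is one common way a spatial prior of this kind can be encoded explicitly.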

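Source journal

ACM Transactions on Multimedia Computing Communications and Applications (Engineering and Technology, Computer Science: Theory and Methods)

CiteScore: 8.50

Self-citation rate: 5.90%

Articles published: 285

Review time: 7.5 months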

Journal description: The ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM) is the flagship publication of the ACM Special Interest Group on Multimedia (SIGMM). It solicits paper submissions on all aspects of multimedia. Papers on single media (for instance, audio, video, animation) and their processing are also welcome. TOMM is a peer-reviewed, archival journal, available in both print and digital form. The journal is published quarterly, with roughly seven 23-page articles in each issue. In addition, all Special Issues are published online only to ensure timely publication. The Transactions consists primarily of research papers. This is an archival journal, and papers are intended to have lasting importance and value over time. In general, papers whose primary focus is on particular multimedia products or the current state of the industry will not be included.