RF-based Multi-view Pose Machine for Multi-Person 3D Pose Estimation

2023 IEEE International Conference on Multimedia and Expo (ICME). Pub Date: 2023-07-01. DOI: 10.1109/ICME55011.2023.00454

Chunyang Xie, Dongheng Zhang, Zhi Wu, Cong Yu, Yang Hu, Qibin Sun, Yan Chen


Citations: 0

Abstract



In this paper, we present the RF-based Multi-view Pose Machine (RF-MvP) for multi-person 3D pose estimation using RF signals. Specifically, we first develop a lightweight anchor-free detector module to locate and crop regions of interest from the horizontal and vertical RF signals. We then propose a Multi-view Fusion Network that unprojects the RF signals from the horizontal and vertical millimeter-wave radars into a unified latent space and calculates their correlation for weighted fusion. Finally, a Spatio-Temporal Attention Network is designed to reconstruct the multi-person 3D skeleton sequences: its spatial attention module recovers invisible body parts using non-local correlations among joints, and its temporal attention module refines the 3D pose sequences using temporal coherency learned from frame queries. We evaluate the proposed RF-MvP and state-of-the-art methods on a large-scale dataset with multi-person 3D pose labels and corresponding radar signals. The experimental results show that RF-MvP outperforms all of the baseline methods, locating multi-person 3D key points with an average error of 73 mm, and generalizes well to new conditions such as occlusion and low illumination.
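To make the reported 73 mm figure concrete, the following is a minimal sketch of one common way to compute an average key-point error: the mean Euclidean distance between predicted and ground-truth 3D joints over all persons. The abstract does not specify the exact evaluation protocol, so the function name `mean_joint_error`, the data layout, and the toy coordinates are assumptions made for illustration only.

```python
import math
from typing import Sequence

Point3D = Sequence[float]  # (x, y, z), assumed to be in millimetres


def mean_joint_error(pred: Sequence[Sequence[Point3D]],
                     truth: Sequence[Sequence[Point3D]]) -> float:
    """Average Euclidean distance over every joint of every person.

    pred and truth are indexed as [person][joint] and must have
    the same shape. This is an illustrative metric, not necessarily
    the protocol used in the paper.
    """
    if len(pred) != len(truth):
        raise ValueError("person counts differ")
    total, count = 0.0, 0
    for pred_person, true_person in zip(pred, truth):
        if len(pred_person) != len(true_person):
            raise ValueError("joint counts differ")
        for p, t in zip(pred_person, true_person):
            total += math.dist(p, t)  # straight-line 3D distance
            count += 1
    if count == 0:
        raise ValueError("no joints to evaluate")
    return total / count


# Hypothetical toy data: two people with two joints each.
truth = [[(0, 0, 0), (0, 0, 1000)], [(500, 0, 0), (500, 0, 1000)]]
pred = [[(30, 40, 0), (0, 0, 1000)], [(500, 0, 0), (500, 60, 1080)]]

error_mm = mean_joint_error(pred, truth)
print(f"average joint error: {error_mm:.1f} mm")
```

In this toy case two joints are exact and two are off by 50 mm and 100 mm, so the average is 37.5 mm. An average of 73 mm therefore means that, across the evaluated joints, predictions lie about 7 cm from the labels on average, with individual joints possibly much closer or farther.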
