HiSC4D: Human-Centered Interaction and 4D Scene Capture in Large-Scale Space Using Wearable IMUs and LiDAR

Yudi Dai, Zhiyong Wang, Xiping Lin, Chenglu Wen, Lan Xu, Siqi Shen, Yuexin Ma, Cheng Wang
{"title":"HiSC4D: Human-Centered Interaction and 4D Scene Capture in Large-Scale Space Using Wearable IMUs and LiDAR","authors":"Yudi Dai;Zhiyong Wang;Xiping Lin;Chenglu Wen;Lan Xu;Siqi Shen;Yuexin Ma;Cheng Wang","doi":"10.1109/TPAMI.2024.3457229","DOIUrl":null,"url":null,"abstract":"We introduce HiSC4D, a novel \n<b>H</b>\numan-centered \n<b>i</b>\nnteraction and \n<b>4D</b>\n \n<b>S</b>\ncene \n<b>C</b>\napture method, aimed at accurately and efficiently creating a dynamic digital world, containing large-scale indoor-outdoor scenes, diverse human motions, rich human-human interactions, and human-environment interactions. By utilizing body-mounted IMUs and a head-mounted LiDAR, HiSC4D can capture egocentric human motions in unconstrained space without the need for external devices and pre-built maps. This affords great flexibility and accessibility for human-centered interaction and 4D scene capturing in various environments. Taking into account that IMUs can capture human spatially unrestricted poses but are prone to drifting for long-period using, and while LiDAR is stable for global localization but rough for local positions and orientations, HiSC4D employs a joint optimization method, harmonizing all sensors and utilizing environment cues, yielding promising results for long-term capture in large scenes. To promote research of egocentric human interaction in large scenes and facilitate downstream tasks, we also present a dataset, containing 8 sequences in 4 large scenes (200 to 5,000 \n<inline-formula><tex-math>$\\text{m}^{2}$</tex-math></inline-formula>\n), providing 36 k frames of accurate 4D human motions with SMPL annotations and dynamic scenes, 31k frames of cropped human point clouds, and scene mesh of the environment. A variety of scenarios, such as the basketball gym and commercial street, alongside challenging human motions, such as daily greeting, one-on-one basketball playing, and tour guiding, demonstrate the effectiveness and the generalization ability of HiSC4D. The dataset and code will be publicly available for research purposes.","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"46 12","pages":"11236-11253"},"PeriodicalIF":0.0000,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on pattern analysis and machine intelligence","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10670484/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

We introduce HiSC4D, a novel Human-centered interaction and 4D Scene Capture method aimed at accurately and efficiently creating a dynamic digital world containing large-scale indoor and outdoor scenes, diverse human motions, rich human-human interactions, and human-environment interactions. By utilizing body-mounted IMUs and a head-mounted LiDAR, HiSC4D can capture egocentric human motions in unconstrained space without the need for external devices or pre-built maps. This affords great flexibility and accessibility for human-centered interaction and 4D scene capture in various environments. IMUs can capture spatially unrestricted human poses but are prone to drift over long periods of use, while LiDAR is stable for global localization but coarse for local positions and orientations; HiSC4D therefore employs a joint optimization method that harmonizes all sensors and exploits environment cues, yielding promising results for long-term capture in large scenes. To promote research on egocentric human interaction in large scenes and facilitate downstream tasks, we also present a dataset containing 8 sequences in 4 large scenes (200 to 5,000 $\text{m}^{2}$), providing 36k frames of accurate 4D human motions with SMPL annotations and dynamic scenes, 31k frames of cropped human point clouds, and scene meshes of the environments. A variety of scenarios, such as a basketball gym and a commercial street, alongside challenging human motions, such as daily greetings, one-on-one basketball playing, and tour guiding, demonstrate the effectiveness and generalization ability of HiSC4D. The dataset and code will be publicly available for research purposes.
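To illustrate the general idea behind the sensor fusion described above (IMUs provide smooth relative motion that drifts over time, LiDAR provides stable but coarse global positions, and a joint objective reconciles the two), here is a minimal toy sketch. It is not the HiSC4D optimization: the paper jointly optimizes full body poses, LiDAR localization, and environment cues, whereas this example only fuses a 2D trajectory with linear least squares. The function name `fuse_trajectory`, the weights `w_imu`/`w_lidar`, and all noise levels are illustrative assumptions.

```python
# Toy sensor-fusion sketch (NOT the HiSC4D method): fuse drifting IMU-style
# relative motion with sparse, noisy LiDAR-style absolute positions by
# solving one joint linear least-squares problem.
import numpy as np

def fuse_trajectory(imu_deltas, lidar_obs, lidar_idx, w_imu=1.0, w_lidar=0.3):
    """Estimate 2D positions x_0..x_{T-1} minimizing
       sum_t w_imu   * ||x_{t+1} - x_t - imu_deltas[t]||^2   (relative, IMU-like)
     + sum_k w_lidar * ||x_{lidar_idx[k]} - lidar_obs[k]||^2  (absolute, LiDAR-like)
    """
    T = len(imu_deltas) + 1
    rows = 2 * (T - 1) + 2 * len(lidar_idx)
    A = np.zeros((rows, 2 * T))
    b = np.zeros(rows)

    r = 0
    # Relative-motion constraints: smooth locally, drift if integrated alone.
    for t in range(T - 1):
        for d in range(2):
            A[r, 2 * (t + 1) + d] = w_imu
            A[r, 2 * t + d] = -w_imu
            b[r] = w_imu * imu_deltas[t, d]
            r += 1
    # Absolute constraints: coarse but drift-free anchors at sparse frames.
    for k, t in enumerate(lidar_idx):
        for d in range(2):
            A[r, 2 * t + d] = w_lidar
            b[r] = w_lidar * lidar_obs[k, d]
            r += 1

    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x.reshape(T, 2)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T = 200
    true_traj = np.cumsum(rng.normal(0.1, 0.05, size=(T, 2)), axis=0)
    # Simulated IMU: accurate steps plus a small bias, so integration drifts.
    imu_deltas = np.diff(true_traj, axis=0) + 0.01
    # Simulated LiDAR: noisy absolute positions every 10 frames.
    lidar_idx = np.arange(0, T, 10)
    lidar_obs = true_traj[lidar_idx] + rng.normal(0, 0.2, size=(len(lidar_idx), 2))

    est = fuse_trajectory(imu_deltas, lidar_obs, lidar_idx)
    imu_only = np.vstack([true_traj[0], true_traj[0] + np.cumsum(imu_deltas, axis=0)])
    print("IMU-only final error:", np.linalg.norm(imu_only[-1] - true_traj[-1]))
    print("Fused final error:   ", np.linalg.norm(est[-1] - true_traj[-1]))
```

Running the script shows the integrated-only trajectory accumulating error with time while the jointly optimized one stays anchored, which mirrors the drift-versus-global-stability trade-off the abstract describes, albeit in a deliberately simplified linear setting.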