HOIsim: Synthesizing Realistic 3D Human-Object Interaction Data for Human Activity Recognition

Marsil Zakour, Alaeddine Mellouli, R. Chaudhari
Published in: 2021 30th IEEE International Conference on Robot & Human Interactive Communication (RO-MAN), pp. 1124-1131
Publication date: 2021-08-08
DOI: 10.1109/RO-MAN50785.2021.9515349 (https://doi.org/10.1109/RO-MAN50785.2021.9515349)
Citation count: 3

Abstract

Correct understanding of human activities is critical for meaningful assistance by robots in daily life. The development of perception algorithms and deep learning models of human activity requires large-scale sensor datasets. Good real-world activity data is, however, difficult and time-consuming to acquire. Several precisely calibrated and time-synchronized sensors are required, and the annotation and labeling of the collected sensor data is extremely labor-intensive. To address these challenges, we present a 3D activity simulator, "HOIsim", focusing on Human-Object Interactions (HOIs). Using HOIsim, we provide a procedurally generated synthetic dataset of two sample daily life activities, "lunch" and "breakfast". The dataset contains out-of-the-box ground truth annotations in the form of human and object poses, as well as ground truth activity labels. Furthermore, we introduce methods to meaningfully randomize activity flows and the environment topology. This allows us to generate a large number of random variants of these activities in very little time. Based on an abstraction of the low-level pose data in the form of spatiotemporal graphs of HOIs, we evaluate only the generated lunch dataset, using two deep learning models for activity recognition. The first model, based on recurrent neural networks, achieves an accuracy of 87%, whereas the other, based on transformers, achieves an accuracy of 94.7%.
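The abstract does not say how the spatiotemporal HOI graphs are built from the pose data. As an illustration only, the following minimal sketch assumes one plausible construction: each entity in each frame is a node, a spatial edge links the human to any object within a distance threshold in the same frame, and a temporal edge links the same entity across consecutive frames. The function name `build_hoi_graph`, the `hand` entity, and the `contact_radius` value are hypothetical and are not taken from the paper.

```python
import math
from typing import Dict, List, Tuple

Position = Tuple[float, float, float]
Frame = Dict[str, Position]          # entity name -> 3D position at one time step
Node = Tuple[int, str]               # (frame index, entity name)
Edge = Tuple[Node, Node]


def build_hoi_graph(
    frames: List[Frame],
    human: str = "hand",             # hypothetical name of the human node
    contact_radius: float = 0.15,    # hypothetical proximity threshold in metres
) -> Tuple[List[Edge], List[Edge]]:
    """Return (spatial_edges, temporal_edges) for a sequence of pose frames."""
    spatial_edges: List[Edge] = []
    temporal_edges: List[Edge] = []
    for t, frame in enumerate(frames):
        for name, pos in frame.items():
            # Spatial edge: the human is close to an object in the same frame.
            if (
                name != human
                and human in frame
                and math.dist(frame[human], pos) <= contact_radius
            ):
                spatial_edges.append(((t, human), (t, name)))
            # Temporal edge: the same entity appears in the next frame.
            if t + 1 < len(frames) and name in frames[t + 1]:
                temporal_edges.append(((t, name), (t + 1, name)))
    return spatial_edges, temporal_edges


# Toy sequence: the hand starts near the cup, then moves next to the plate.
frames = [
    {"hand": (0.0, 0.0, 0.0), "cup": (0.1, 0.0, 0.0), "plate": (1.0, 0.0, 0.0)},
    {"hand": (0.9, 0.0, 0.0), "cup": (0.1, 0.0, 0.0), "plate": (1.0, 0.0, 0.0)},
]
spatial, temporal = build_hoi_graph(frames)
print("spatial edges:", spatial)
print("temporal edges:", temporal)
```

In a setup like this, the edge lists (optionally with node features such as relative poses) would be the input to a sequence model such as a recurrent network or a transformer. The features and architectures actually used in the paper may differ.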