MILD: Multimodal Interactive Latent Dynamics for Learning Human-Robot Interaction

V. Prasad, Dorothea Koert, R. Stock-Homburg, Jan Peters, G. Chalvatzaki
{"title":"学习人机交互的多模态交互潜在动力学","authors":"V. Prasad, Dorothea Koert, R. Stock-Homburg, Jan Peters, G. Chalvatzaki","doi":"10.1109/Humanoids53995.2022.10000239","DOIUrl":null,"url":null,"abstract":"Modeling interaction dynamics to generate robot trajectories that enable a robot to adapt and react to a human's actions and intentions is critical for efficient and effective collaborative Human-Robot Interactions (HRI). Learning from Demonstration (LfD) methods from Human-Human Interactions (HHI) have shown promising results, especially when coupled with representation learning techniques. However, such methods for learning HRI either do not scale well to high dimensional data or cannot accurately adapt to changing via-poses of the interacting partner. We propose Multimodal Interactive Latent Dynamics (MILD), a method that couples deep representation learning and probabilistic machine learning to address the problem of two-party physical HRIs. We learn the interaction dynamics from demonstrations, using Hidden Semi-Markov Models (HSMMs) to model the joint distribution of the interacting agents in the latent space of a Variational Autoencoder (VAE). Our experimental evaluations for learning HRI from HHI demonstrations show that MILD effectively captures the multimodality in the latent representations of HRI tasks, allowing us to decode the varying dynamics occurring in such tasks. Compared to related work, MILD generates more accurate trajectories for the controlled agent (robot) when conditioned on the observed agent's (human) trajectory. Notably, MILD can learn directly from camera-based pose estimations to generate trajectories, which we then map to a humanoid robot without the need for any additional training. Supplementary Material: https://bit.ly/MILD-HRI.","PeriodicalId":180816,"journal":{"name":"2022 IEEE-RAS 21st International Conference on Humanoid Robots (Humanoids)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"MILD: Multimodal Interactive Latent Dynamics for Learning Human-Robot Interaction\",\"authors\":\"V. Prasad, Dorothea Koert, R. Stock-Homburg, Jan Peters, G. Chalvatzaki\",\"doi\":\"10.1109/Humanoids53995.2022.10000239\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Modeling interaction dynamics to generate robot trajectories that enable a robot to adapt and react to a human's actions and intentions is critical for efficient and effective collaborative Human-Robot Interactions (HRI). Learning from Demonstration (LfD) methods from Human-Human Interactions (HHI) have shown promising results, especially when coupled with representation learning techniques. However, such methods for learning HRI either do not scale well to high dimensional data or cannot accurately adapt to changing via-poses of the interacting partner. We propose Multimodal Interactive Latent Dynamics (MILD), a method that couples deep representation learning and probabilistic machine learning to address the problem of two-party physical HRIs. We learn the interaction dynamics from demonstrations, using Hidden Semi-Markov Models (HSMMs) to model the joint distribution of the interacting agents in the latent space of a Variational Autoencoder (VAE). 
Our experimental evaluations for learning HRI from HHI demonstrations show that MILD effectively captures the multimodality in the latent representations of HRI tasks, allowing us to decode the varying dynamics occurring in such tasks. Compared to related work, MILD generates more accurate trajectories for the controlled agent (robot) when conditioned on the observed agent's (human) trajectory. Notably, MILD can learn directly from camera-based pose estimations to generate trajectories, which we then map to a humanoid robot without the need for any additional training. Supplementary Material: https://bit.ly/MILD-HRI.\",\"PeriodicalId\":180816,\"journal\":{\"name\":\"2022 IEEE-RAS 21st International Conference on Humanoid Robots (Humanoids)\",\"volume\":\"16 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-10-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE-RAS 21st International Conference on Humanoid Robots (Humanoids)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/Humanoids53995.2022.10000239\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE-RAS 21st International Conference on Humanoid Robots (Humanoids)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/Humanoids53995.2022.10000239","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 4

Abstract

Modeling interaction dynamics to generate robot trajectories that enable a robot to adapt and react to a human's actions and intentions is critical for efficient and effective collaborative Human-Robot Interactions (HRI). Learning from Demonstration (LfD) methods from Human-Human Interactions (HHI) have shown promising results, especially when coupled with representation learning techniques. However, such methods for learning HRI either do not scale well to high dimensional data or cannot accurately adapt to changing via-poses of the interacting partner. We propose Multimodal Interactive Latent Dynamics (MILD), a method that couples deep representation learning and probabilistic machine learning to address the problem of two-party physical HRIs. We learn the interaction dynamics from demonstrations, using Hidden Semi-Markov Models (HSMMs) to model the joint distribution of the interacting agents in the latent space of a Variational Autoencoder (VAE). Our experimental evaluations for learning HRI from HHI demonstrations show that MILD effectively captures the multimodality in the latent representations of HRI tasks, allowing us to decode the varying dynamics occurring in such tasks. Compared to related work, MILD generates more accurate trajectories for the controlled agent (robot) when conditioned on the observed agent's (human) trajectory. Notably, MILD can learn directly from camera-based pose estimations to generate trajectories, which we then map to a humanoid robot without the need for any additional training. Supplementary Material: https://bit.ly/MILD-HRI.
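The core mechanism summarized in the abstract is an HSMM whose Gaussian emissions model the joint distribution of the two agents' latents, so that conditioning on the observed human latent yields the robot latent to decode. The following numpy sketch illustrates only that Gaussian conditioning step under toy stand-in parameters; it is not the authors' implementation, and every name and dimension in it (latent_dim, n_states, condition_robot_latent, the random emission parameters) is a hypothetical placeholder. HSMM fitting, duration modeling, and the VAE encoder/decoder are omitted.

    # Illustrative sketch only -- not the MILD implementation. Per HSMM hidden
    # state k, the joint latent of human (h) and robot (r) is assumed Gaussian:
    #   [z_h; z_r] ~ N(mu_k, Sigma_k)
    # so the robot latent given the human latent follows the standard
    # Gaussian conditional:
    #   z_r | z_h ~ N(mu_r + S_rh S_hh^{-1} (z_h - mu_h),
    #                 S_rr - S_rh S_hh^{-1} S_hr)
    import numpy as np

    latent_dim = 4              # per-agent latent dimension (hypothetical)
    joint_dim = 2 * latent_dim  # concatenated [z_human, z_robot]
    n_states = 3                # number of HSMM hidden states (hypothetical)

    rng = np.random.default_rng(0)

    # Stand-in emission parameters; in practice these would be fit on the
    # concatenated latents of demonstrations.
    mus = rng.normal(size=(n_states, joint_dim))
    Sigmas = np.stack([np.eye(joint_dim) + 0.1 * np.outer(v, v)
                       for v in rng.normal(size=(n_states, joint_dim))])

    def condition_robot_latent(z_h, k):
        """Mean and covariance of the robot latent given the human latent
        under the Gaussian emission of hidden state k."""
        mu_h, mu_r = mus[k, :latent_dim], mus[k, latent_dim:]
        S_hh = Sigmas[k, :latent_dim, :latent_dim]
        S_rh = Sigmas[k, latent_dim:, :latent_dim]
        S_rr = Sigmas[k, latent_dim:, latent_dim:]
        gain = S_rh @ np.linalg.inv(S_hh)   # regression of z_r on z_h
        mean = mu_r + gain @ (z_h - mu_h)
        cov = S_rr - gain @ S_rh.T
        return mean, cov

    # Toy rollout: human latents (in practice from the VAE encoder) and a
    # per-step state sequence (in practice from HSMM forward inference).
    z_human = rng.normal(size=(10, latent_dim))
    states = np.repeat(np.arange(n_states), [4, 3, 3])  # state durations

    z_robot = np.array([condition_robot_latent(z, k)[0]
                        for z, k in zip(z_human, states)])
    print(z_robot.shape)  # (10, 4); decode with the robot-side VAE decoder

Because the regression gain S_rh S_hh^{-1} differs per hidden state, the generated robot latents adapt segment by segment to the partner's observed motion, which plausibly corresponds to the abstract's claim of reacting to changing via-poses of the interacting partner.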