基于模型的多模态交互获取与分析，改善人机交互

Proceedings of the Symposium on Eye Tracking Research and Applications Pub Date : 2014-03-26 DOI:10.1145/2578153.2582176

Patrick Renner, Thies Pfeiffer

{"title":"基于模型的多模态交互获取与分析，改善人机交互","authors":"Patrick Renner, Thies Pfeiffer","doi":"10.1145/2578153.2582176","DOIUrl":null,"url":null,"abstract":"For solving complex tasks cooperatively in close interaction with robots, they need to understand natural human communication. To achieve this, robots could benefit from a deeper understanding of the processes that humans use for successful communication. Such skills can be studied by investigating human face-to-face interactions in complex tasks. In our work the focus lies on shared-space interactions in a path planning task and thus 3D gaze directions and hand movements are of particular interest. However, the analysis of gaze and gestures is a time-consuming task: Usually, manual annotation of the eye tracker's scene camera video is necessary in a frame-by-frame manner. To tackle this issue, based on the EyeSee3D method, an automatic approach for annotating interactions is presented: A combination of geometric modeling and 3D marker tracking serves to align real world stimuli with virtual proxies. This is done based on the scene camera images of the mobile eye tracker alone. In addition to the EyeSee3D approach, face detection is used to automatically detect fixations on the interlocutor. For the acquisition of the gestures, an optical marker tracking system is integrated and fused in the multimodal representation of the communicative situation.","PeriodicalId":142459,"journal":{"name":"Proceedings of the Symposium on Eye Tracking Research and Applications","volume":"13 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Model-based acquisition and analysis of multimodal interactions for improving human-robot interaction\",\"authors\":\"Patrick Renner, Thies Pfeiffer\",\"doi\":\"10.1145/2578153.2582176\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"For solving complex tasks cooperatively in close interaction with robots, they need to understand natural human communication. To achieve this, robots could benefit from a deeper understanding of the processes that humans use for successful communication. Such skills can be studied by investigating human face-to-face interactions in complex tasks. In our work the focus lies on shared-space interactions in a path planning task and thus 3D gaze directions and hand movements are of particular interest. However, the analysis of gaze and gestures is a time-consuming task: Usually, manual annotation of the eye tracker's scene camera video is necessary in a frame-by-frame manner. To tackle this issue, based on the EyeSee3D method, an automatic approach for annotating interactions is presented: A combination of geometric modeling and 3D marker tracking serves to align real world stimuli with virtual proxies. This is done based on the scene camera images of the mobile eye tracker alone. In addition to the EyeSee3D approach, face detection is used to automatically detect fixations on the interlocutor. For the acquisition of the gestures, an optical marker tracking system is integrated and fused in the multimodal representation of the communicative situation.\",\"PeriodicalId\":142459,\"journal\":{\"name\":\"Proceedings of the Symposium on Eye Tracking Research and Applications\",\"volume\":\"13 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-03-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the Symposium on Eye Tracking Research and Applications\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2578153.2582176\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Symposium on Eye Tracking Research and Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2578153.2582176","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 3

摘要

为了在与机器人的密切互动中合作解决复杂的任务，他们需要理解人类的自然交流。为了实现这一目标，机器人可以从对人类用于成功沟通的过程的更深入理解中受益。这些技能可以通过调查人类在复杂任务中的面对面互动来研究。在我们的工作中，重点在于路径规划任务中的共享空间交互，因此3D凝视方向和手部运动特别有趣。然而，对注视和手势的分析是一项耗时的任务:通常需要对眼动仪的场景摄像机视频进行逐帧的手动注释。为了解决这个问题，基于EyeSee3D方法，提出了一种自动注释交互的方法:几何建模和3D标记跟踪的结合，用于将真实世界的刺激与虚拟代理对齐。这是基于移动眼动仪的场景摄像头图像完成的。除了eyeesee3d方法外，人脸检测还用于自动检测对话者的注视。对于手势的获取，集成了一个光学标记跟踪系统，并融合在交流情境的多模态表示中。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Model-based acquisition and analysis of multimodal interactions for improving human-robot interaction

For solving complex tasks cooperatively in close interaction with robots, they need to understand natural human communication. To achieve this, robots could benefit from a deeper understanding of the processes that humans use for successful communication. Such skills can be studied by investigating human face-to-face interactions in complex tasks. In our work the focus lies on shared-space interactions in a path planning task and thus 3D gaze directions and hand movements are of particular interest. However, the analysis of gaze and gestures is a time-consuming task: Usually, manual annotation of the eye tracker's scene camera video is necessary in a frame-by-frame manner. To tackle this issue, based on the EyeSee3D method, an automatic approach for annotating interactions is presented: A combination of geometric modeling and 3D marker tracking serves to align real world stimuli with virtual proxies. This is done based on the scene camera images of the mobile eye tracker alone. In addition to the EyeSee3D approach, face detection is used to automatically detect fixations on the interlocutor. For the acquisition of the gestures, an optical marker tracking system is integrated and fused in the multimodal representation of the communicative situation.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the Symposium on Eye Tracking Research and Applications

自引率

0.00%

发文量