{"title":"基于模型的多模态交互获取与分析,改善人机交互","authors":"Patrick Renner, Thies Pfeiffer","doi":"10.1145/2578153.2582176","DOIUrl":null,"url":null,"abstract":"For solving complex tasks cooperatively in close interaction with robots, they need to understand natural human communication. To achieve this, robots could benefit from a deeper understanding of the processes that humans use for successful communication. Such skills can be studied by investigating human face-to-face interactions in complex tasks. In our work the focus lies on shared-space interactions in a path planning task and thus 3D gaze directions and hand movements are of particular interest. However, the analysis of gaze and gestures is a time-consuming task: Usually, manual annotation of the eye tracker's scene camera video is necessary in a frame-by-frame manner. To tackle this issue, based on the EyeSee3D method, an automatic approach for annotating interactions is presented: A combination of geometric modeling and 3D marker tracking serves to align real world stimuli with virtual proxies. This is done based on the scene camera images of the mobile eye tracker alone. In addition to the EyeSee3D approach, face detection is used to automatically detect fixations on the interlocutor. For the acquisition of the gestures, an optical marker tracking system is integrated and fused in the multimodal representation of the communicative situation.","PeriodicalId":142459,"journal":{"name":"Proceedings of the Symposium on Eye Tracking Research and Applications","volume":"13 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Model-based acquisition and analysis of multimodal interactions for improving human-robot interaction\",\"authors\":\"Patrick Renner, Thies Pfeiffer\",\"doi\":\"10.1145/2578153.2582176\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"For solving complex tasks cooperatively in close interaction with robots, they need to understand natural human communication. To achieve this, robots could benefit from a deeper understanding of the processes that humans use for successful communication. Such skills can be studied by investigating human face-to-face interactions in complex tasks. In our work the focus lies on shared-space interactions in a path planning task and thus 3D gaze directions and hand movements are of particular interest. However, the analysis of gaze and gestures is a time-consuming task: Usually, manual annotation of the eye tracker's scene camera video is necessary in a frame-by-frame manner. To tackle this issue, based on the EyeSee3D method, an automatic approach for annotating interactions is presented: A combination of geometric modeling and 3D marker tracking serves to align real world stimuli with virtual proxies. This is done based on the scene camera images of the mobile eye tracker alone. In addition to the EyeSee3D approach, face detection is used to automatically detect fixations on the interlocutor. 
For the acquisition of the gestures, an optical marker tracking system is integrated and fused in the multimodal representation of the communicative situation.\",\"PeriodicalId\":142459,\"journal\":{\"name\":\"Proceedings of the Symposium on Eye Tracking Research and Applications\",\"volume\":\"13 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-03-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the Symposium on Eye Tracking Research and Applications\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2578153.2582176\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Symposium on Eye Tracking Research and Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2578153.2582176","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Model-based acquisition and analysis of multimodal interactions for improving human-robot interaction
To solve complex tasks cooperatively in close interaction with humans, robots need to understand natural human communication. To achieve this, robots could benefit from a deeper understanding of the processes humans use to communicate successfully. Such skills can be studied by investigating human face-to-face interaction in complex tasks. Our work focuses on shared-space interactions in a path-planning task, so 3D gaze directions and hand movements are of particular interest. However, analyzing gaze and gestures is time-consuming: usually, the eye tracker's scene camera video has to be annotated manually, frame by frame. To tackle this issue, an automatic approach for annotating interactions, based on the EyeSee3D method, is presented: a combination of geometric modeling and 3D marker tracking aligns real-world stimuli with virtual proxies, using only the scene camera images of the mobile eye tracker. In addition to the EyeSee3D approach, face detection is used to automatically detect fixations on the interlocutor. For the acquisition of gestures, an optical marker tracking system is integrated and its data fused into the multimodal representation of the communicative situation.
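The abstract outlines a pipeline in which the scene camera of the mobile eye tracker is registered against a geometric model of the environment via marker tracking, the measured gaze is cast into that model to hit virtual proxies of real objects, and face detection flags fixations on the interlocutor. The sketch below is only an illustration of that kind of pipeline using OpenCV and NumPy; the camera intrinsics, marker geometry, box proxy, and all function names are hypothetical placeholders, not the authors' EyeSee3D implementation.

```python
# Minimal sketch of an EyeSee3D-style annotation step (not the authors' code):
# 1) recover the scene-camera pose from a known fiducial marker,
# 2) cast the gaze ray into a geometric proxy model,
# 3) use face detection to flag fixations on the interlocutor.
import numpy as np
import cv2

# Hypothetical calibration and scene model (placeholder values).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0,   0.0,   1.0]])      # scene-camera intrinsics
dist = np.zeros(5)                        # assume negligible lens distortion
MARKER_CORNERS_WORLD = np.array([         # known 3D corners of one fiducial (metres)
    [0.0, 0.0, 0.0], [0.05, 0.0, 0.0], [0.05, 0.05, 0.0], [0.0, 0.05, 0.0]])
PROXY_MIN = np.array([0.2, -0.1, 0.0])    # axis-aligned box proxy of a real object
PROXY_MAX = np.array([0.4,  0.1, 0.1])

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")


def camera_pose(marker_corners_px):
    """Scene-camera pose in the world frame from detected 2D marker corners."""
    ok, rvec, tvec = cv2.solvePnP(MARKER_CORNERS_WORLD, marker_corners_px, K, dist)
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)
    # world <- camera: X_w = R^T (X_c - t), camera origin is -R^T t
    return R.T, -R.T @ tvec.ravel()


def gaze_ray_world(gaze_px, R_wc, cam_origin):
    """Back-project the 2D gaze point into a world-space ray."""
    d_cam = np.linalg.inv(K) @ np.array([gaze_px[0], gaze_px[1], 1.0])
    d_world = R_wc @ d_cam
    return cam_origin, d_world / np.linalg.norm(d_world)


def hits_proxy(origin, direction):
    """Slab test: does the gaze ray intersect the box proxy?"""
    with np.errstate(divide="ignore", invalid="ignore"):
        t1 = (PROXY_MIN - origin) / direction
        t2 = (PROXY_MAX - origin) / direction
    t_near = np.max(np.minimum(t1, t2))
    t_far = np.min(np.maximum(t1, t2))
    return t_far >= max(t_near, 0.0)


def fixates_interlocutor(frame_gray, gaze_px):
    """Face detection on the scene image: does the gaze point fall in a face box?"""
    for (x, y, w, h) in face_cascade.detectMultiScale(frame_gray, 1.1, 5):
        if x <= gaze_px[0] <= x + w and y <= gaze_px[1] <= y + h:
            return True
    return False
```

In a full pipeline of this kind, marker_corners_px would come from a fiducial detector (for example OpenCV's ArUco module) and gaze_px from the eye tracker for every frame, so each fixation could be labelled automatically with the proxy it hits, or with "interlocutor" when it lands inside a detected face region.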