GazeIn '13: Latest Publications

A dominance estimation mechanism using eye-gaze and turn-taking information
GazeIn '13 Pub Date: 2013-12-13 DOI: 10.1145/2535948.2535956
Misato Yatsushiro, Naoya Ikeda, Yuki Hayashi, Y. Nakano
Abstract: With the goal of contributing to multiparty conversation management, this paper proposes a mechanism for estimating conversational dominance in group interaction. Based on our corpus analysis, we have already established a regression model for dominance estimation using speech and gaze information. In this study, we implement the model as a dominance estimation mechanism and propose using it to moderate multiparty conversations between a conversational robot and three human users. The system decides whom it should talk to based on the dominance level of each user.
Citations: 0
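
The abstract does not spell out the regression model or the addressing policy, so the following is only a minimal sketch of the general idea: fit a linear regression from per-participant speech/gaze features to an annotated dominance level, then have the robot choose whom to address from the predicted levels. The feature names, toy data, and the least-dominant-user policy are illustrative assumptions, not the authors' method.

# Hypothetical sketch: regress a dominance score from per-participant
# speech/gaze features, then address the least dominant user to balance
# participation. Features, data, and policy are invented for illustration.
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy training data: [speaking_time_ratio, turns_taken, gaze_received_ratio]
X_train = np.array([
    [0.50, 12, 0.45],
    [0.30,  7, 0.35],
    [0.20,  4, 0.20],
    [0.60, 15, 0.55],
])
y_train = np.array([0.8, 0.5, 0.2, 0.9])  # annotated dominance levels

model = LinearRegression().fit(X_train, y_train)

# At run time, estimate dominance for each of the three users.
current = {
    "user_a": [0.55, 10, 0.50],
    "user_b": [0.30,  6, 0.30],
    "user_c": [0.15,  3, 0.20],
}
scores = {u: float(model.predict([f])[0]) for u, f in current.items()}
next_addressee = min(scores, key=scores.get)
print(scores, "->", next_addressee)
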
Learning aspects of interest from Gaze
GazeIn '13 Pub Date: 2013-12-13 DOI: 10.1145/2535948.2535955
Kei Shimonishi, H. Kawashima, Ryo Yonetani, Erina Ishikawa, T. Matsuyama
Abstract: This paper presents a probabilistic framework for modeling the gaze generative process when a user is browsing content consisting of multiple regions. The model enables us to learn multiple aspects of interest from gaze data, to represent and estimate a user's interest as a mixture of aspects, and to predict gaze behavior in a unified framework. We recorded gaze data from subjects while they browsed a digital pictorial book, and confirmed the effectiveness of the proposed model in terms of predicting the gaze target.
Citations: 4
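
As a rough stand-in for the paper's generative model (which the abstract does not specify), the sketch below treats each browsing session as a bag of fixated regions and learns latent "aspects of interest" with LDA; a user's interest is then a mixture over those aspects. LDA, the region set, and the counts are assumptions made purely to illustrate the mixture-of-aspects idea.

# Illustrative stand-in: learn latent aspects from region-fixation counts.
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

# Rows: browsing sessions; columns: fixation counts on regions of a page
# (e.g. picture, caption, habitat map, body-size chart in a pictorial book).
gaze_counts = np.array([
    [12,  3,  0,  1],
    [10,  5,  1,  0],
    [ 1,  0,  9,  8],
    [ 0,  2, 11,  7],
])

lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(gaze_counts)

# A user's interest is represented as a mixture over the learned aspects;
# the aspect-region distributions could then be used to predict future gaze.
new_session = np.array([[8, 4, 1, 0]])
print(lda.transform(new_session))                                     # mixture weights
print(lda.components_ / lda.components_.sum(axis=1, keepdims=True))   # aspect profiles
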
The acoustics of eye contact: detecting visual attention from conversational audio cues
GazeIn '13 Pub Date: 2013-12-13 DOI: 10.1145/2535948.2535949
F. Eyben, F. Weninger, L. Paletta, Björn Schuller
Abstract: An important aspect of short dialogues is attention, as manifested by eye contact between subjects. In this study we provide a first analysis of whether such visual attention is evident in the acoustic properties of a speaker's voice. We introduce the multi-modal GRAS2 corpus, which was recorded for analysing attention in short, daily-life human-to-human interactions with strangers in public places in Graz, Austria. The corpus contains recordings of four test subjects equipped with eye-tracking glasses, three audio recording devices, and motion sensors. We describe how we robustly identify speech segments from the subjects and other people in an unsupervised manner from multi-channel recordings. We then discuss correlations between the acoustics of the voice in these segments and the point of visual attention of the subjects. A significant relation is found between the acoustic features and the distance between the point of view and the eye region of the dialogue partner. Further, we show that automatic classification of the binary decision eye-contact vs. no eye-contact from acoustic features alone is feasible, with an Unweighted Average Recall of up to 70%.
Citations: 11
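
The reported metric, Unweighted Average Recall (UAR), is the mean of the per-class recalls, so chance level for the binary eye-contact task is 50% regardless of class imbalance. A quick illustration with made-up labels:

# UAR = macro-averaged recall; labels below are invented for the example.
from sklearn.metrics import recall_score

y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]   # 1 = eye contact, 0 = no eye contact
y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]

uar = recall_score(y_true, y_pred, average="macro")
print(f"UAR = {uar:.2f}")   # recall(class 1) = 4/6, recall(class 0) = 3/4 -> 0.71
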
Mutual disambiguation of eye gaze and speech for sight translation and reading
GazeIn '13 Pub Date: 2013-12-13 DOI: 10.1145/2535948.2535953
Rucha Kulkarni, Kritika Jain, H. Bansal, S. Bangalore, M. Carl
Abstract: Researchers are proposing interactive machine translation as a potential method for making the language translation process more efficient and usable. The introduction of different modalities such as eye gaze and speech is being explored to add to the interactivity of language translation systems. Unfortunately, the raw data provided by Automatic Speech Recognition (ASR) and eye tracking is very noisy and erroneous. This paper describes a technique for reducing the errors of the two modalities, speech and eye gaze, with the help of each other in the context of sight translation and reading. Lattice representation and composition of the two modalities were used for integration. F-measure for eye gaze and Word Accuracy for ASR were used as metrics to evaluate our results. In the reading task, we demonstrated a significant improvement in both eye-gaze F-measure and speech Word Accuracy. In the sight translation task, a significant improvement was found in gaze F-measure but not in ASR.
Citations: 2
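
The paper integrates the modalities by composing ASR and gaze lattices; the sketch below is a heavily simplified stand-in that only conveys the mutual-disambiguation idea: rescore ASR n-best hypotheses by their overlap with fixated words and keep only fixations supported by the winning hypothesis. The scoring scheme, weight, and data are illustrative assumptions, not the authors' lattice composition.

# Toy mutual disambiguation of ASR n-best lists and noisy fixations.
def rescore(asr_nbest, fixated_words, alpha=0.5):
    """asr_nbest: list of (hypothesis_words, acoustic_score in [0, 1])."""
    def overlap(hyp):
        return len(set(hyp) & set(fixated_words)) / max(len(hyp), 1)
    scored = [(hyp, (1 - alpha) * s + alpha * overlap(hyp)) for hyp, s in asr_nbest]
    best_hyp, _ = max(scored, key=lambda x: x[1])
    cleaned_fixations = [w for w in fixated_words if w in best_hyp]
    return best_hyp, cleaned_fixations

nbest = [
    (["the", "sight", "translation", "task"], 0.60),
    (["the", "site", "translation", "task"], 0.65),
]
print(rescore(nbest, fixated_words=["sight", "translation"]))
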
Situated multi-modal dialog system in vehicles
GazeIn '13 Pub Date: 2013-12-13 DOI: 10.1145/2535948.2535951
Teruhisa Misu, Antoine Raux, Ian Lane, Joan Devassy, Rakesh Gupta
Abstract: In this paper, we describe Townsurfer, a situated multi-modal dialog system in vehicles. The system integrates the multi-modal inputs of speech, geo-location, gaze (face direction), and dialog history to answer drivers' queries about their surroundings. To select the appropriate data source for answering a query, we apply belief tracking across the above modalities. We conducted a preliminary data collection and an evaluation focusing on the effect of gaze (head direction) and geo-location estimations. We report the results and our analysis of the data.
Citations: 22
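
The abstract does not detail Townsurfer's belief tracker, so the following is only a toy illustration of belief tracking across modalities: maintain a distribution over candidate points of interest and update it with independent likelihoods derived from geo-location and head direction. The POI names and likelihood values are invented for the example.

# Toy multimodal belief update over candidate points of interest (POIs).
def normalize(d):
    z = sum(d.values())
    return {k: v / z for k, v in d.items()}

def update(belief, likelihood):
    return normalize({k: belief[k] * likelihood.get(k, 1e-6) for k in belief})

belief = normalize({"cafe": 1.0, "museum": 1.0, "bank": 1.0})   # uniform prior
geo_lik = {"cafe": 0.7, "museum": 0.2, "bank": 0.1}             # vehicle near the cafe
gaze_lik = {"cafe": 0.3, "museum": 0.6, "bank": 0.1}            # driver looking left

belief = update(belief, geo_lik)
belief = update(belief, gaze_lik)
print(max(belief, key=belief.get), belief)   # most likely referent of the query
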
Agent-assisted multi-viewpoint video viewer and its gaze-based evaluation
GazeIn '13 Pub Date: 2013-12-13 DOI: 10.1145/2535948.2535952
Takatsugu Hirayama, Takafumi Marutani, Daishi Tanoue, Shogo Tokai, S. Fels, K. Mase
Abstract: Humans see things from various viewpoints, but nobody attempts to see anything from every viewpoint owing to physical restrictions and the great effort required. Intelligent interfaces for viewing multi-viewpoint videos may remove these restrictions in effective ways and direct us toward a new visual world. We propose an agent-assisted multi-viewpoint video viewer that incorporates (1) target-centered viewpoint switching and (2) social viewpoint recommendation. The viewer stabilizes an object at the center of the display field using the former function, which helps to fix the user's gaze on the target object. To identify popular viewing behavior for particular content, the latter function exploits a histogram of the viewing log in terms of time, viewpoints, and the target, built from many personal viewing experiences. We call this knowledge source of the director agent a viewgram. The agent automatically constructs the preferred viewpoint sequence for each target. We conducted user studies to analyze user behavior, especially eye movement, while using the viewer. The results of statistical analyses showed that the viewpoint sequence extracted from a viewgram includes a more distinct perspective for each target, and that target-centered viewpoint switching encourages the user to gaze at the display center, where the target is located, during viewing. The proposed viewer can provide more effective perspectives on the main attractions in scenes.
Citations: 1
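
A minimal sketch of the viewgram idea as described in the abstract: accumulate many viewers' logs into per-target histograms over (time bin, viewpoint) and recommend, for each time bin, the viewpoint chosen most often. The log format and the simple argmax sequence construction are assumptions; the paper's implementation may differ.

# Toy viewgram: histogram of viewing logs and a recommended viewpoint sequence.
from collections import defaultdict

# Each log entry: (target, time_bin, viewpoint_id)
logs = [
    ("dancer", 0, 3), ("dancer", 0, 3), ("dancer", 0, 1),
    ("dancer", 1, 2), ("dancer", 1, 2), ("dancer", 1, 3),
]

viewgram = defaultdict(lambda: defaultdict(int))
for target, t, vp in logs:
    viewgram[(target, t)][vp] += 1

def recommended_sequence(target, n_bins):
    return [max(viewgram[(target, t)], key=viewgram[(target, t)].get)
            for t in range(n_bins)]

print(recommended_sequence("dancer", 2))   # -> [3, 2]
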
Finding the timings for a guide agent to intervene inter-user conversation in considering their gaze behaviors
GazeIn '13 Pub Date: 2013-12-13 DOI: 10.1145/2535948.2535957
Shochi Otogi, Hung-Hsuan Huang, R. Hotta, K. Kawagoe
Abstract: With the advance of embodied conversational agent (ECA) technologies, there are more and more real-world deployed applications of ECAs, such as guides in museums or exhibitions. However, in those situations, the agent systems are usually used by groups of visitors rather than individuals. In such a multi-user situation, which is much more complex than the single-user one, specific features are required. One of them is the ability of the agent to smoothly intervene in user-user conversation. This feature is supposed to facilitate mixed-initiative human-agent conversation and more proactive service for the users. This paper presents the results of the first step of our project, which aims to build an information-providing agent for collaborative decision-making tasks: finding the timings at which the agent can intervene in user-user conversation to provide active support, by focusing on the users' gaze. To realize this, a Wizard-of-Oz (WOZ) experiment was first conducted to collect human interaction data. By analyzing the collected corpus, eight kinds of timings at which the agent could potentially intervene were found. Second, a method was developed to automatically identify four of the eight kinds of timings using only nonverbal cues: gaze direction, body posture, and speech information. Although the performance of the method is moderate (F-measure 0.4), it should be possible to improve it by integrating context information in the future.
Citations: 1
Unravelling the interaction strategies and gaze in collaborative learning with online video lectures
GazeIn '13 Pub Date: 2013-12-13 DOI: 10.1145/2535948.2535959
R. Bednarik, Marko Kauppinen
Abstract: Using dual eye tracking, we performed a study characterising the differences in interaction patterns when learning from online materials individually or with a peer. The findings show that in the majority of cases, users prefer to use the online learning materials in parallel when working on a learning task with their own tool. Collaborative learning took longer due to negotiation overheads, and most attention was paid to the materials. However, collaboration did not have an effect on the overall distribution of gaze.
Citations: 2
Context aware addressee estimation for human robot interaction
GazeIn '13 Pub Date: 2013-12-13 DOI: 10.1145/2535948.2535958
Samira Sheikhi, D. Jayagopi, Vasil Khalidov, J. Odobez
Abstract: The paper investigates the problem of addressee recognition (to whom a speaker's utterance is intended) in a setting involving a humanoid robot interacting with multiple persons. More specifically, since it is well known that the addressee can primarily be derived from the speaker's visual focus of attention (VFOA), defined as whom or what a person is looking at, we address the following questions: how much does performance degrade when using VFOA automatically extracted from head pose instead of the VFOA ground truth? Can the conversational context improve addressee recognition, either directly as a side cue in the addressee classifier, or indirectly by improving VFOA recognition, or in both ways? Finally, from a computational perspective, which VFOA features and normalizations are better, and does it matter whether the VFOA recognition module only monitors whether a person looks at potential addressee targets (the robot, people), or whether it also considers objects of interest in the environment (paintings in our case) as additional VFOA targets? Experiments on the public Vernissage database, in which the humanoid Nao robot presents a quiz to two participants, show that reducing VFOA confusion (either through context, or by ignoring VFOA targets) improves addressee recognition.
Citations: 12
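
To make the head-pose question concrete, here is a deliberately simple way VFOA could be estimated from head pan alone (nearest target direction within a threshold) and turned into a robot-vs-human addressee decision. The angles, threshold, and decision rule are illustrative only; the paper's recognizers and its use of conversational context are considerably richer.

# Toy VFOA estimation from head pan and a naive addressee rule.
def estimate_vfoa(head_pan_deg, targets, threshold_deg=20.0):
    """targets: target name -> pan angle (degrees) from the speaker's position."""
    name, angle = min(targets.items(), key=lambda kv: abs(kv[1] - head_pan_deg))
    return name if abs(angle - head_pan_deg) <= threshold_deg else "unfocused"

# Illustrative geometry: robot straight ahead, the other participant 45 degrees
# to the right, a painting 60 degrees to the left.
targets = {"robot": 0.0, "other_person": 45.0, "painting": -60.0}
vfoa = estimate_vfoa(head_pan_deg=8.0, targets=targets)
addressee = "robot" if vfoa == "robot" else "other_person"
print(vfoa, addressee)   # robot robot
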
Feature selection for gaze, pupillary, and EEG signals evoked in a 3D environment
GazeIn '13 Pub Date: 2013-12-13 DOI: 10.1145/2535948.2535950
D. Jangraw, P. Sajda
Abstract: As we navigate our environment, we are constantly assessing the objects we encounter and deciding on their subjective interest to us. In this study, we investigate the neural and ocular correlates of this assessment as a step towards their potential use in a mobile human-computer interface (HCI). Past research has shown that multiple physiological signals are evoked by objects of interest during visual search in the laboratory, including gaze, pupil dilation, and neural activity; these have been exploited in various HCIs. We use a virtual environment to explore which of these signals are also evoked during exploration of a dynamic, free-viewing 3D environment. Using a hierarchical classifier and sequential forward floating selection (SFFS), we identify a small, robust set of features across multiple modalities that can be used to distinguish targets from distractors in the virtual environment. The identification of these features may serve as an important factor in the design of mobile HCIs.
Citations: 7
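
Sequential forward floating selection (SFFS) greedily adds the feature that most improves a cross-validated score and, after each addition, removes previously selected features whenever doing so beats the best subset found at that size. The sketch below shows the generic procedure on synthetic data with a logistic-regression scorer; the paper's hierarchical classifier and gaze/pupil/EEG features are not reproduced here.

# Compact SFFS sketch with a placeholder classifier on synthetic data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def cv_score(X, y, subset):
    # Cross-validated accuracy of a feature subset.
    return cross_val_score(LogisticRegression(max_iter=1000),
                           X[:, sorted(subset)], y, cv=5).mean()

def sffs(X, y, k):
    selected = set()
    best = {0: (-np.inf, set())}          # best (score, subset) seen per size
    while len(selected) < k:
        # Forward step: add the feature that yields the best score.
        gains = {j: cv_score(X, y, selected | {j})
                 for j in range(X.shape[1]) if j not in selected}
        j_add = max(gains, key=gains.get)
        selected.add(j_add)
        if gains[j_add] > best.get(len(selected), (-np.inf,))[0]:
            best[len(selected)] = (gains[j_add], set(selected))
        # Floating step: drop features while that beats the best smaller subset.
        while len(selected) > 1:
            drops = {j: cv_score(X, y, selected - {j}) for j in selected}
            j_drop = max(drops, key=drops.get)
            if drops[j_drop] > best[len(selected) - 1][0]:
                selected.remove(j_drop)
                best[len(selected)] = (drops[j_drop], set(selected))
            else:
                break
    score_k, subset_k = best[k]
    return sorted(subset_k), score_k

X, y = make_classification(n_samples=200, n_features=10, n_informative=3,
                           random_state=0)
print(sffs(X, y, k=3))
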