ICMI-MLMI '10 | Pub Date: 2010-11-08 | DOI: 10.1145/1891903.1891969
Conversation scene analysis based on dynamic Bayesian network and image-based gaze detection
Sebastian Gorga, K. Otsuka
Abstract: This paper presents a probabilistic framework, incorporating automatic image-based gaze detection, for inferring the structure of multiparty face-to-face conversations. The framework infers conversation regimes and gaze patterns from the nonverbal behaviors of meeting participants, captured from image and audio streams with cameras and microphones. A conversation regime corresponds to a global conversational pattern such as monologue or dialogue, and a gaze pattern indicates "who is looking at whom". Input nonverbal behaviors include the presence/absence of utterances, head directions, and discrete head-centered eye-gaze directions. In contrast to conventional meeting analysis methods that rely only on a participant's head pose as a surrogate for visual focus of attention, this paper newly incorporates vision-based gaze detection, combined with head pose tracking, into a probabilistic conversation model based on a dynamic Bayesian network. The gaze detector can differentiate three to five eye-gaze directions, e.g. left, straight, and right. Experiments on four-person conversations confirm the power of the proposed framework in identifying conversation structure and in estimating gaze patterns with higher accuracy than previous models.
ICMI-MLMI '10 | Pub Date: 2010-11-08 | DOI: 10.1145/1891903.1891913
Employing social gaze and speaking activity for automatic determination of the Extraversion trait
B. Lepri, Subramanian Ramanathan, Kyriaki Kalimeri, Jacopo Staiano, F. Pianesi, N. Sebe
Abstract: To predict the Extraversion personality trait, we exploit medium-grained behaviors enacted in group meetings, namely speaking time and social attention (social gaze). The latter is further distinguished into attention given to the other group members and attention received from them. The results of our work confirm many of our hypotheses: a) speaking time and (some forms of) social gaze are effective in automatically predicting Extraversion; b) classification accuracy is affected by the size of the time slices used for analysis; and c) to a large extent, considering the social context does not add much to prediction accuracy, with an important exception concerning social gaze.
ICMI-MLMI '10 | Pub Date: 2010-11-08 | DOI: 10.1145/1891903.1891912
Focusing computational visual attention in multi-modal human-robot interaction
Boris Schauerte, G. Fink
Abstract: Identifying verbally and non-verbally referred-to objects is an important aspect of human-robot interaction; most importantly, it is essential for achieving a joint focus of attention and, thus, natural interaction behavior. In this contribution, we introduce a saliency-based model that reflects how multi-modal referring acts influence visual search, i.e. the task of finding a specific object in a scene. To this end, we combine positional information obtained from pointing gestures with contextual knowledge about the visual appearance of the referred-to object obtained from language. The available information is then integrated into a biologically motivated saliency model that forms the basis for visual search. We demonstrate the feasibility of the proposed approach through an experimental evaluation.
ICMI-MLMI '10 | Pub Date: 2010-11-08 | DOI: 10.1145/1891903.1891959
Design and evaluation of a wearable remote social touch device
Rongrong Wang, Francis K. H. Quek, J. Teh, A. Cheok, Sep Riang Lai
Abstract: Psychological and sociological studies have established the essential role that touch plays in interpersonal communication. However, this channel is largely ignored in current telecommunication technologies. We design and implement a remote-touch armband with an electric motor actuator, paired with a touch input device in the form of a force-sensor-embedded smartphone case. When the smartphone is squeezed, the paired armband is activated to simulate a squeeze on the user's upper arm. A usability study with 22 participants evaluates the device in terms of perceptibility. The results show that users can easily perceive touch at different force levels.
ICMI-MLMI '10 | Pub Date: 2010-11-08 | DOI: 10.1145/1891903.1891940
Activity-based Ubicomp: a new research basis for the future of human-computer interaction
J. Landay
Abstract: Ubiquitous computing (Ubicomp) is bringing computing off the desktop and into our everyday lives. For example, an interactive display can be used by the family of an elder to stay in constant touch with the elder's everyday wellbeing, or by a group to visualize and share information about exercise and fitness. Mobile sensors, networks, and displays are proliferating worldwide in mobile phones, enabling this new wave of applications that are intimate with the user's physical world. In addition to being ubiquitous, these applications share a focus on high-level activities: long-term social processes that take place in multiple environments and are supported by complex computation and inference over sensor data. However, the promise of this Activity-based Ubicomp is unfulfilled, primarily due to methodological, design, and tool limitations in how we understand the dynamics of activities. The traditional cognitive-psychology basis for human-computer interaction, which focuses on our short-term interactions with technological artifacts, is insufficient for achieving the promise of Activity-based Ubicomp. We are developing design methodologies and tools, as well as activity recognition technologies, both to demonstrate the potential of Activity-based Ubicomp and to support designers in fruitfully creating these types of applications.
ICMI-MLMI '10 | Pub Date: 2010-11-08 | DOI: 10.1145/1891903.1891942
Visual speech synthesis by modelling coarticulation dynamics using a non-parametric switching state-space model
S. Deena, Shaobo Hou, Aphrodite Galata
Abstract: We present a novel approach to speech-driven facial animation using a non-parametric switching state-space model based on Gaussian processes. The model is an extension of the shared Gaussian process dynamical model, augmented with switching states. Audio and visual data from a talking-head corpus are jointly modelled using the proposed method. The switching states are found using variable-length Markov models trained on labelled phonetic data. We also propose a synthesis technique that takes into account both previous and future phonetic context, thus accounting for coarticulatory effects in speech.
ICMI-MLMI '10 | Pub Date: 2010-11-08 | DOI: 10.1145/1891903.1891933
Understanding contextual factors in location-aware multimedia messaging
Abdallah El Ali, F. Nack, L. Hardman
Abstract: Location-aware messages left by people can make visible some aspects of their everyday experiences at a location. To understand the contextual factors surrounding how users produce and consume location-aware multimedia messaging (LMM), we use an experience-centered framework that makes explicit the different aspects of an experience. Using this framework, we conducted an exploratory diary study aimed at eliciting implications for the study and design of LMM systems. In an earlier pilot study, we found that subjects did not have enough time to fully capture their everyday experiences using an LMM prototype, which led us to conduct a longer study using a multimodal diary method. The diary study data (verified for reliability using a categorization task) provide a closer look at the different aspects (spatiotemporal, social, affective, and cognitive) of people's experience. From the data, we derive three main findings (predominant LMM domains and tasks, capturing experience vs. the experience of capture, and context-dependent personalization) to inform the study and design of future LMM systems.
ICMI-MLMI '10 | Pub Date: 2010-11-08 | DOI: 10.1145/1891903.1891920
Cloud mouse: a new way to interact with the cloud
Chunhui Zhang, Min Wang, R. Harper
Abstract: In this paper we present a novel input device and associated UI metaphors for Cloud computing. Cloud computing will give users access to huge amounts of data in new forms, anywhere and anytime, with applications ranging from Web data mining to social networks. The motivation of this work is to give users access to Cloud computing through a new personal device and to turn nearby displays into personal displays. The key points of this device are direct-point operation, a grasping UI, and tangible feedback. A UI metaphor for Cloud computing is also introduced.
ICMI-MLMI '10 | Pub Date: 2010-11-08 | DOI: 10.1145/1891903.1891922
Musical performance as multimodal communication: drummers, musical collaborators, and listeners
R. Ashley
Abstract: Musical performance provides an interesting domain for understanding and investigating multimodal communication. Although the primary modality of music is auditory, musicians make considerable use of the visual channel as well. This talk examines musical performance as multimodal, focusing on drumming in one style of popular music (funk or soul). The ways drummers interact and communicate with their musical collaborators and with listeners are examined in terms of the structure of the different musical parts; processes of mutual coordination, entrainment, and turn-taking (complementarity) are highlighted. Both pre-determined (composed) and spontaneous (improvised) behaviors are considered. The way in which digital drumsets function as complexly structured human interfaces to sound-synthesis systems is examined as well.
ICMI-MLMI '10 | Pub Date: 2010-11-08 | DOI: 10.1145/1891903.1891949
Haptic numbers: three haptic representation models for numbers on a touch screen phone
Toni Pakkanen, R. Raisamo, Katri Salminen, Veikko Surakka
Abstract: Systematic research on haptic stimuli is needed to create a viable haptic feel for user interface elements. There has been a great deal of research with haptic user interface prototypes, but much less on haptic stimulus design. In this study we compared three haptic representation models, at two presentation rates, for the numbers used in the phone number-keypad layout. Haptic representations for the numbers were derived from Arabic and Roman numerals, and from the location of the number button in the layout grid. Using a Nokia 5800 XpressMusic phone, participants entered phone numbers into the phone without looking. Speed, error rate, and subjective experiences were recorded. The results showed that the model had no effect on measured performance, but subjective experiences were affected: the Arabic numbers at the slower rate were preferred most. Thus, performance was subjectively rated as better even though objective measures showed no differences.