ICMI-MLMI '10 | Pub Date: 2010-11-08 | DOI: 10.1145/1891903.1891966
M. Voit, R. Stiefelhagen
{"title":"3D user-perspective, voxel-based estimation of visual focus of attention in dynamic meeting scenarios","authors":"M. Voit, R. Stiefelhagen","doi":"10.1145/1891903.1891966","DOIUrl":"https://doi.org/10.1145/1891903.1891966","url":null,"abstract":"In this paper we present a new framework for the online estimation of people's visual focus of attention from their head poses in dynamic meeting scenarios. We describe a voxel based approach to reconstruct the scene composition from an observer's perspective, in order to integrate occlusion handling and visibility verification. The observer's perspective is thereby simulated with live head pose tracking over four far-field views from the room's upper corners. We integrate motion and speech activity as further scene observations in a Bayesian Surprise framework to model prior attractors of attention within the situation's context. As evaluations on a dedicated dataset with 10 meeting videos show, this allows us to predict a meeting participant's focus of attention correctly in up to 72.2% of all frames.","PeriodicalId":181145,"journal":{"name":"ICMI-MLMI '10","volume":"94 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125536154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ICMI-MLMI '10 | Pub Date: 2010-11-08 | DOI: 10.1145/1891903.1891944
Stefanie Tellex, T. Kollar, George Shaw, N. Roy, D. Roy
{"title":"Grounding spatial language for video search","authors":"Stefanie Tellex, T. Kollar, George Shaw, N. Roy, D. Roy","doi":"10.1145/1891903.1891944","DOIUrl":"https://doi.org/10.1145/1891903.1891944","url":null,"abstract":"The ability to find a video clip that matches a natural language description of an event would enable intuitive search of large databases of surveillance video. We present a mechanism for connecting a spatial language query to a video clip corresponding to the query. The system can retrieve video clips matching millions of potential queries that describe complex events in video such as \"people walking from the hallway door, around the island, to the kitchen sink.\" By breaking down the query into a sequence of independent structured clauses and modeling the meaning of each component of the structure separately, we are able to improve on previous approaches to video retrieval by finding clips that match much longer and more complex queries using a rich set of spatial relations such as \"down\" and \"past.\" We present a rigorous analysis of the system's performance, based on a large corpus of task-constrained language collected from fourteen subjects. Using this corpus, we show that the system effectively retrieves clips that match natural language descriptions: 58.3% were ranked in the top two of ten in a retrieval task. Furthermore, we show that spatial relations play an important role in the system's performance.","PeriodicalId":181145,"journal":{"name":"ICMI-MLMI '10","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131165400","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ICMI-MLMI '10 | Pub Date: 2010-11-08 | DOI: 10.1145/1891903.1891929
Koji Kamei, K. Shinozawa, Tetsushi Ikeda, A. Utsumi, T. Miyashita, N. Hagita
{"title":"Recommendation from robots in a real-world retail shop","authors":"Koji Kamei, K. Shinozawa, Tetsushi Ikeda, A. Utsumi, T. Miyashita, N. Hagita","doi":"10.1145/1891903.1891929","DOIUrl":"https://doi.org/10.1145/1891903.1891929","url":null,"abstract":"By applying network robot technologies, recommendation methods from E-Commerce are incorporated in a retail shop in the real world. We constructed an experimental shop environment where communication robots recommend specific items to the customers according to their purchasing behavior as observed by networked sensors. A recommendation scenario is implemented with three robots and investigated through an experiment. The results indicate that the participants stayed longer in front of the shelves when the communication robots tried to interact with them and were influenced to carry out similar purchasing behaviors as those observed earlier. Other results suggest that the probability of customers' zone transition can be used to anticipate their purchasing behavior.","PeriodicalId":181145,"journal":{"name":"ICMI-MLMI '10","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116381431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ICMI-MLMI '10 | Pub Date: 2010-11-08 | DOI: 10.1145/1891903.1891948
Myunghee Lee, G. Kim
{"title":"Empathetic video experience through timely multimodal interaction","authors":"Myunghee Lee, G. Kim","doi":"10.1145/1891903.1891948","DOIUrl":"https://doi.org/10.1145/1891903.1891948","url":null,"abstract":"In this paper, we describe a video playing system, named \"Empatheater,\" that is controlled by multimodal interaction. As the video is played, the user must interact and emulate predefined video \"events\" through multimodal guidance and whole body interaction (e.g. following the main character's motion or gestures). Without the timely interaction, the video stops. The system shows guidance information as how to properly react and continue the video playing. The purpose of such a system is to provide indirect experience (of the given video content) by eliciting the user to mimic and empathize with the main character. The user is given the illusion (suspended disbelief) of playing an active role in the unraveling video content. We discuss various features of the newly proposed interactive medium. In addition, we report on the results of the pilot study that was carried out to evaluate its user experience compared to passive video viewing and keyboard based video control.","PeriodicalId":181145,"journal":{"name":"ICMI-MLMI '10","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128594165","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ICMI-MLMI '10 | Pub Date: 2010-11-08 | DOI: 10.1145/1891903.1891957
Masahiro Tada, H. Noma, K. Renge
{"title":"Evidence-based automated traffic hazard zone mapping using wearable sensors","authors":"Masahiro Tada, H. Noma, K. Renge","doi":"10.1145/1891903.1891957","DOIUrl":"https://doi.org/10.1145/1891903.1891957","url":null,"abstract":"Recently, underestimating traffic condition risk is considered one of the biggest reasons for traffic accidents. In this paper, we proposed evidence-based automatic hazard zone mapping method using wearable sensors. Here, we measure driver's behavior using three-axis gyro sensors. Analyzing the measured motion data, proposed method can label characteristic motion that is observed at hazard zone. We gathered motion data sets form two types of driver, i.e., an instructor of driving school and an ordinary driver, then, tried to generate traffic hazard zone map focused on difference of the motions. Through the experiment in public road, we confirmed our method allows to extract hazard zone.","PeriodicalId":181145,"journal":{"name":"ICMI-MLMI '10","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128421379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ICMI-MLMI '10 | Pub Date: 2010-11-08 | DOI: 10.1145/1891903.1891918
Luis Rodríguez, I. García-Varea, Alejandro Revuelta-Martínez, E. Vidal
{"title":"A multimodal interactive text generation system","authors":"Luis Rodríguez, I. García-Varea, Alejandro Revuelta-Martínez, E. Vidal","doi":"10.1145/1891903.1891918","DOIUrl":"https://doi.org/10.1145/1891903.1891918","url":null,"abstract":"We present an interactive text generation system aimed at providing assistance for text typing in different environments. This system works by predicting what the user is going to type based on the text he or she typed previously. A multimodal interface is included, intended to facilitate the text generation in constrained environments. The prototype is designed following a modular client-server architecture to provide a high flexibility.","PeriodicalId":181145,"journal":{"name":"ICMI-MLMI '10","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114635072","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cognitive skills learning: pen input patterns in computer-based athlete training","authors":"Natalie Ruiz, Qian Qian Feng, R. Taib, Tara Handke, Fang Chen","doi":"10.1145/1891903.1891955","DOIUrl":"https://doi.org/10.1145/1891903.1891955","url":null,"abstract":"In this paper, we describe a longitudinal user study with athletes using a cognitive training tool, equipped with an interactive pen interface, and think-aloud protocols. The aim is to verify whether cognitive load can be inferred directly from changes in geometric and temporal features of the pen trajectories. We compare trajectories across cognitive load levels and overall Pre and Post training tests. The results show trajectory durations and lengths decrease while speeds increase, all significantly, as cognitive load increases. These changes are attributed to mechanisms for dealing with high cognitive load in working memory, with minimal rehearsal. With more expertise, trajectory durations further decrease and speeds further increase, which is attributed in part to cognitive skill acquisition and to schema development, both in extraneous and intrinsic networks, between Pre and Post tests. As such, these pen trajectory features offer insight into implicit communicative changes related to load fluctuations.","PeriodicalId":181145,"journal":{"name":"ICMI-MLMI '10","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131607583","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ICMI-MLMI '10 | Pub Date: 2010-11-08 | DOI: 10.1145/1891903.1891937
Peng-Wen Chen, Snehal Kumar Chennuru, S. Buthpitiya, Y. Zhang
{"title":"A language-based approach to indexing heterogeneous multimedia lifelog","authors":"Peng-Wen Chen, Snehal Kumar Chennuru, S. Buthpitiya, Y. Zhang","doi":"10.1145/1891903.1891937","DOIUrl":"https://doi.org/10.1145/1891903.1891937","url":null,"abstract":"Lifelog systems, inspired by Vannevar Bush's concept of \"MEMory EXtenders\" (MEMEX), are capable of storing a person's lifetime experience as a multimedia database. Despite such systems' huge potential for improving people's everyday life, there are major challenges that need to be addressed to make such systems practical. One of them is how to index the inherently large and heterogeneous lifelog data so that a person can efficiently retrieve the log segments that are of interest. In this paper, we present a novel approach to indexing lifelogs using activity language. By quantizing the heterogeneous high dimensional sensory data into text representation, we are able to apply statistical natural language processing techniques to index, recognize, segment, cluster, retrieve, and infer high-level semantic meanings of the collected lifelogs. Based on this indexing approach, our lifelog system supports easy retrieval of log segments representing past similar activities and generation of salient summaries serving as overviews of segments.","PeriodicalId":181145,"journal":{"name":"ICMI-MLMI '10","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117179243","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ICMI-MLMI '10 | Pub Date: 2010-11-08 | DOI: 10.1145/1891903.1891950
Juan Cheng, Xiang Chen, Zhiyuan Lu, Kongqiao Wang, M. Shen
{"title":"Key-press gestures recognition and interaction based on SEMG signals","authors":"Juan Cheng, Xiang Chen, Zhiyuan Lu, Kongqiao Wang, M. Shen","doi":"10.1145/1891903.1891950","DOIUrl":"https://doi.org/10.1145/1891903.1891950","url":null,"abstract":"This article conducted research on the pattern recognition of keypress finger gestures based on surface electromyographic (SEMG) signals and the feasibility of key -press gestures for interaction application. Two sort of recognition experiments were designed firstly to explore the feasibility and repeatability of the SEMG -based classification of 1 6 key-press finger gestures relating to right hand and 4 control gestures, and the key -press gestures were defined referring to the standard PC key board. Based on the experimental results, 10 quite well recognized key -press gestures were selected as numeric input keys of a simulated phone, and the 4 control gestures were mapped to 4 control keys. Then two types of use tests, namely volume setting and SMS sending were conducted to survey the gesture-base interaction performance and user's attitude to this technique, and the test results showed that users could accept this novel input strategy with fresh experience.","PeriodicalId":181145,"journal":{"name":"ICMI-MLMI '10","volume":"250 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124751540","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ICMI-MLMI '10 | Pub Date: 2010-11-08 | DOI: 10.1145/1891903.1891915
Nikolaus Bee, J. Wagner, E. André, Thurid Vogt, Fred Charles, D. Pizzi, M. Cavazza
{"title":"Discovering eye gaze behavior during human-agent conversation in an interactive storytelling application","authors":"Nikolaus Bee, J. Wagner, E. André, Thurid Vogt, Fred Charles, D. Pizzi, M. Cavazza","doi":"10.1145/1891903.1891915","DOIUrl":"https://doi.org/10.1145/1891903.1891915","url":null,"abstract":"In this paper, we investigate the user's eye gaze behavior during the conversation with an interactive storytelling application. We present an interactive eye gaze model for embodied conversational agents in order to improve the experience of users participating in Interactive Storytelling. The underlying narrative in which the approach was tested is based on a classical XIXth century psychological novel: Madame Bovary, by Flaubert. At various stages of the narrative, the user can address the main character or respond to her using free-style spoken natural language input, impersonating her lover. An eye tracker was connected to enable the interactive gaze model to respond to user's current gaze (i.e. looking into the virtual character's eyes or not). We conducted a study with 19 students where we compared our interactive eye gaze model with a non-interactive eye gaze model that was informed by studies of human gaze behaviors, but had no information on where the user was looking. The interactive model achieved a higher score for user ratings than the non-interactive model. In addition we analyzed the users' gaze behavior during the conversation with the virtual character.","PeriodicalId":181145,"journal":{"name":"ICMI-MLMI '10","volume":"107 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122054781","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}