Proceedings of the 2015 ACM on International Conference on Multimodal Interaction: Latest Publications

Automatic Detection of Mind Wandering During Reading Using Gaze and Physiology
Authors: R. Bixler, Nathaniel Blanchard, L. Garrison, S. D’Mello
DOI: https://doi.org/10.1145/2818346.2820742
Published: 2015-11-09
Abstract: Mind wandering (MW) entails an involuntary shift in attention from task-related thoughts to task-unrelated thoughts, and has been shown to have detrimental effects on performance in a number of contexts. This paper proposes an automated multimodal detector of MW using eye gaze and physiology (skin conductance and skin temperature) and aspects of the context (e.g., time on task, task difficulty). Data in the form of eye gaze and physiological signals were collected as 178 participants read four instructional texts from a computer interface. Participants periodically provided self-reports of MW in response to pseudorandom auditory probes during reading. Supervised machine learning models trained on features extracted from participants' gaze fixations, physiological signals, and contextual cues were used to detect pages where participants provided positive responses of MW to the auditory probes. Two methods of combining gaze and physiology features were explored. Feature level fusion entailed building a single model by combining feature vectors from individual modalities. Decision level fusion entailed building individual models for each modality and adjudicating amongst individual decisions. Feature level fusion resulted in an 11% improvement in classification accuracy over the best unimodal model, but there was no comparable improvement for decision level fusion. This was reflected by a small improvement in both precision and recall. An analysis of the features indicated that MW was associated with fewer and longer fixations and saccades, and a higher, more deterministic skin temperature. Possible applications of the detector are discussed.
Citations: 30

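The abstract above contrasts feature-level and decision-level fusion. Below is a minimal sketch of that contrast using scikit-learn on synthetic stand-in data; the feature dimensions, the random-forest classifier, and the probability-averaging adjudication rule are illustrative assumptions, not the authors' pipeline.

```python
# Sketch of feature-level vs. decision-level fusion for a binary mind-wandering detector.
# Synthetic arrays stand in for the gaze and physiology features; the classifier and the
# probability-averaging rule are illustrative choices only.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n_pages = 400
gaze = rng.normal(size=(n_pages, 12))      # e.g., fixation counts/durations per page
physio = rng.normal(size=(n_pages, 6))     # e.g., skin conductance / temperature statistics
y = rng.integers(0, 2, size=n_pages)       # self-reported mind wandering (1) or not (0)

# Feature-level fusion: concatenate modality feature vectors and train a single model.
fused = np.hstack([gaze, physio])
pred_feat = cross_val_predict(RandomForestClassifier(random_state=0), fused, y, cv=5)

# Decision-level fusion: train one model per modality, then adjudicate between their
# outputs (here: average the predicted probabilities and threshold at 0.5).
p_gaze = cross_val_predict(RandomForestClassifier(random_state=0), gaze, y,
                           cv=5, method="predict_proba")[:, 1]
p_phys = cross_val_predict(RandomForestClassifier(random_state=0), physio, y,
                           cv=5, method="predict_proba")[:, 1]
pred_dec = ((p_gaze + p_phys) / 2 > 0.5).astype(int)

print("feature-level fusion accuracy:", accuracy_score(y, pred_feat))
print("decision-level fusion accuracy:", accuracy_score(y, pred_dec))
```
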
Multimodal Affect Detection in the Wild: Accuracy, Availability, and Generalizability
Authors: Nigel Bosch
DOI: https://doi.org/10.1145/2818346.2823316
Published: 2015-11-09
Abstract: Affect detection is an important component of computerized learning environments that adapt the interface and materials to students' affect. This paper proposes a plan for developing and testing multimodal affect detectors that generalize across differences in data that are likely to occur in practical applications (e.g., time, demographic variables). Facial features and interaction log features are considered as modalities for affect detection in this scenario, each with their own advantages. Results are presented for completed work evaluating the accuracy of individual modality face- and interaction-based detectors, accuracy and availability of a multimodal combination of these modalities, and initial steps toward generalization of face-based detectors. Additional data collection needed for cross-culture generalization testing is also completed. Challenges and possible solutions for proposed cross-cultural generalization testing of multimodal detectors are also discussed.
Citations: 9

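One practical point raised above is modality availability: face features drop out whenever no face is detected, while interaction logs are always present. The hedged sketch below shows one way such a fallback could look, with hypothetical logistic-regression models trained on synthetic data; the averaging fusion rule and the models themselves are assumptions, not the detectors described in the paper.

```python
# Sketch of a multimodal affect score that tolerates a missing face channel: average the
# two modality probabilities when both are available, otherwise fall back to the
# interaction-log model alone. Models, features, and fusion rule are illustrative only.
from typing import Optional
import numpy as np
from sklearn.linear_model import LogisticRegression

def fused_affect_probability(face_model: LogisticRegression,
                             log_model: LogisticRegression,
                             face_feats: Optional[np.ndarray],
                             log_feats: np.ndarray) -> float:
    """Return P(affect present); face_feats is None when no face was detected."""
    p_log = log_model.predict_proba(log_feats.reshape(1, -1))[0, 1]
    if face_feats is None:                       # face channel unavailable
        return p_log
    p_face = face_model.predict_proba(face_feats.reshape(1, -1))[0, 1]
    return (p_face + p_log) / 2.0

# Hypothetical training on synthetic data, just to make the sketch runnable.
rng = np.random.default_rng(1)
X_face, X_log = rng.normal(size=(200, 8)), rng.normal(size=(200, 5))
y = rng.integers(0, 2, size=200)
face_model = LogisticRegression(max_iter=1000).fit(X_face, y)
log_model = LogisticRegression(max_iter=1000).fit(X_log, y)
print(fused_affect_probability(face_model, log_model, None, X_log[0]))       # face missing
print(fused_affect_probability(face_model, log_model, X_face[0], X_log[0]))  # both available
```
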
Who's Speaking?: Audio-Supervised Classification of Active Speakers in Video
Authors: Punarjay Chakravarty, S. Mirzaei, T. Tuytelaars, H. V. hamme
DOI: https://doi.org/10.1145/2818346.2820780
Published: 2015-11-09
Abstract: Active speakers have traditionally been identified in video by detecting their moving lips. This paper demonstrates the same using spatio-temporal features that aim to capture other cues: movement of the head, upper body and hands of active speakers. Speaker directional information, obtained using sound source localization from a microphone array, is used to supervise the training of these video features.
Citations: 35

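A rough sketch of the audio-supervision idea described above: a direction-of-arrival (DOA) estimate from the microphone array picks the tracked person closest to the sound direction, and those weak labels then supervise a classifier over per-person video features. The matching tolerance, feature shapes, and classifier choice are assumptions for illustration only.

```python
# Sketch of audio-supervised labelling: per-frame audio DOA estimates are matched to
# tracked persons' camera-relative azimuths to produce weak "who is speaking" labels,
# which then train a per-person speaking/not-speaking classifier on video features.
import numpy as np
from sklearn.linear_model import LogisticRegression

def weak_labels_from_doa(doa_deg: np.ndarray, person_azimuths_deg: np.ndarray,
                         tol_deg: float = 15.0) -> np.ndarray:
    """For each frame, return the index of the person closest to the audio DOA,
    or -1 if nobody is within tol_deg."""
    diffs = np.abs(person_azimuths_deg[None, :] - doa_deg[:, None])   # (frames, persons)
    nearest = diffs.argmin(axis=1)
    return np.where(diffs[np.arange(len(doa_deg)), nearest] <= tol_deg, nearest, -1)

rng = np.random.default_rng(2)
person_azimuths = np.array([-30.0, 0.0, 40.0])     # assumed camera-relative positions
doa = rng.uniform(-60, 60, size=500)               # per-frame DOA estimates (degrees)
speaker_idx = weak_labels_from_doa(doa, person_azimuths)

# Train "is this person speaking?" on per-person video features using the weak labels.
video_feats = rng.normal(size=(500, 3, 32))        # (frames, persons, feature_dim)
X = video_feats.reshape(-1, 32)
y = np.array([[1 if speaker_idx[f] == p else 0 for p in range(3)]
              for f in range(500)]).ravel()
clf = LogisticRegression(max_iter=1000).fit(X, y)
```
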
Leveraging Behavioral Patterns of Mobile Applications for Personalized Spoken Language Understanding
Authors: Yun-Nung (Vivian) Chen, Ming Sun, Alexander I. Rudnicky, A. Gershman
DOI: https://doi.org/10.1145/2818346.2820781
Published: 2015-11-09
Abstract: Spoken language interfaces are appearing in various smart devices (e.g. smart-phones, smart-TV, in-car navigating systems) and serve as intelligent assistants (IAs). However, most of them do not consider individual users' behavioral profiles and contexts when modeling user intents. Such behavioral patterns are user-specific and provide useful cues to improve spoken language understanding (SLU). This paper focuses on leveraging the app behavior history to improve spoken dialog systems performance. We developed a matrix factorization approach that models speech and app usage patterns to predict user intents (e.g. launching a specific app). We collected multi-turn interactions in a WoZ scenario; users were asked to reproduce the multi-app tasks that they had performed earlier on their smart-phones. By modeling latent semantics behind lexical and behavioral patterns, the proposed multi-model system achieves about 52% of turn accuracy for intent prediction on ASR transcripts.
Citations: 32

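A simplified sketch of the matrix-factorization idea: dialogue turns form the rows of a joint matrix over lexical features and app-launch indicators, a low-rank factorization reconstructs that matrix, and the app columns of the reconstruction are scored to predict intent. The feature construction, the use of NMF, and the rank are illustrative assumptions rather than the paper's model.

```python
# Sketch of intent (app) prediction via matrix factorization over a joint
# turn-by-(lexical + app) matrix. Synthetic data; rank and features are placeholders.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(3)
n_turns, n_words, n_apps = 300, 50, 10
lexical = (rng.random((n_turns, n_words)) < 0.1).astype(float)     # bag-of-words from ASR
apps = np.zeros((n_turns, n_apps))
apps[np.arange(n_turns), rng.integers(0, n_apps, n_turns)] = 1.0   # app launched in the turn

M = np.hstack([lexical, apps])                     # joint turn-by-feature matrix
model = NMF(n_components=8, init="nndsvda", max_iter=500, random_state=0)
W = model.fit_transform(M)
M_hat = W @ model.components_                      # low-rank reconstruction

# Score the app columns of the reconstruction and pick the highest-scoring app per turn.
predicted_app = M_hat[:, n_words:].argmax(axis=1)
print("turn-level accuracy on the training matrix:",
      (predicted_app == apps.argmax(axis=1)).mean())
```
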
Recurrent Neural Networks for Emotion Recognition in Video
Authors: S. Kahou, Vincent Michalski, K. Konda, R. Memisevic, C. Pal
DOI: https://doi.org/10.1145/2818346.2830596
Published: 2015-11-09
Abstract: Deep learning based approaches to facial analysis and video analysis have recently demonstrated high performance on a variety of key tasks such as face recognition, emotion recognition and activity recognition. In the case of video, information often must be aggregated across a variable length sequence of frames to produce a classification result. Prior work using convolutional neural networks (CNNs) for emotion recognition in video has relied on temporal averaging and pooling operations reminiscent of widely used approaches for the spatial aggregation of information. Recurrent neural networks (RNNs) have seen an explosion of recent interest as they yield state-of-the-art performance on a variety of sequence analysis tasks. RNNs provide an attractive framework for propagating information over a sequence using a continuous valued hidden layer representation. In this work we present a complete system for the 2015 Emotion Recognition in the Wild (EmotiW) Challenge. We focus our presentation and experimental analysis on a hybrid CNN-RNN architecture for facial expression analysis that can outperform a previously applied CNN approach using temporal averaging for aggregation.
Citations: 321

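A minimal sketch of the aggregation contrast described above: per-frame CNN features are either averaged over time and classified, or passed through an RNN whose final hidden state is classified. The random frame features and the GRU/linear layer sizes are placeholders, not the EmotiW system's architecture.

```python
# Sketch of temporal averaging vs. RNN aggregation over per-frame CNN features.
# The CNN is faked with random features; layer sizes are illustrative only.
import torch
import torch.nn as nn

n_frames, feat_dim, n_classes = 16, 128, 7          # 7 emotion classes, as in EmotiW
frame_feats = torch.randn(1, n_frames, feat_dim)    # (batch, time, features) from a CNN

# (a) Temporal averaging: collapse time with a mean, then classify.
avg_classifier = nn.Linear(feat_dim, n_classes)
logits_avg = avg_classifier(frame_feats.mean(dim=1))

# (b) RNN aggregation: a GRU propagates information across frames; classify its
# final hidden state instead of a simple average.
gru = nn.GRU(input_size=feat_dim, hidden_size=64, batch_first=True)
rnn_classifier = nn.Linear(64, n_classes)
_, h_n = gru(frame_feats)                           # h_n: (num_layers, batch, hidden)
logits_rnn = rnn_classifier(h_n[-1])

print(logits_avg.shape, logits_rnn.shape)           # both: torch.Size([1, 7])
```
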
Detecting Mastication: A Wearable Approach
Authors: Abdelkareem Bedri, Apoorva Verlekar, Edison Thomaz, Valerie Avva, Thad Starner
DOI: https://doi.org/10.1145/2818346.2820767
Published: 2015-11-09
Abstract: We explore using the Outer Ear Interface (OEI) to recognize eating activities. OEI contains a 3D gyroscope and a set of proximity sensors encapsulated in an off-the-shelf earpiece to monitor jaw movement by measuring ear canal deformation. In a laboratory setting with 20 participants, OEI could distinguish eating from other activities, such as walking, talking, and silently reading, with over 90% accuracy (user independent). In a second study, six subjects wore the system for 6 hours each while performing their normal daily activities. OEI correctly classified five minute segments of time as eating or non-eating with 93% accuracy (user dependent).
Citations: 54

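A sketch of the segment-level evaluation described above: sensor streams are cut into fixed-length segments, each segment is summarized with simple statistics, and a classifier labels it as eating or non-eating. The sampling rate, features, and classifier below are assumptions; the OEI hardware and its actual feature set are not reproduced here.

```python
# Sketch of five-minute-segment classification from gyroscope and proximity channels.
# Synthetic data; window length, statistics, and classifier are illustrative only.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def segment_features(gyro: np.ndarray, prox: np.ndarray) -> np.ndarray:
    """Summarize one segment: per-channel mean, std, and mean absolute difference."""
    chans = np.hstack([gyro, prox])                  # (samples, channels)
    return np.hstack([chans.mean(0), chans.std(0),
                      np.abs(np.diff(chans, axis=0)).mean(0)])

rng = np.random.default_rng(4)
fs = 50                                              # assumed sampling rate (Hz)
seg_len = 5 * 60 * fs                                # five-minute segments
X, y = [], []
for _ in range(60):                                  # synthetic labeled segments
    eating = int(rng.integers(0, 2))
    gyro = rng.normal(scale=1.0 + eating, size=(seg_len, 3))   # 3-axis gyroscope
    prox = rng.normal(scale=1.0 + eating, size=(seg_len, 4))   # proximity channels
    X.append(segment_features(gyro, prox))
    y.append(eating)
clf = RandomForestClassifier(random_state=0).fit(np.array(X), np.array(y))
```
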
Implicit Human-computer Interaction: Two Complementary Approaches
Authors: Julia Wache
DOI: https://doi.org/10.1145/2818346.2823311
Published: 2015-11-09
Abstract: One of the main goals in Human Computer Interaction (HCI) is improving the interface between users and computers: Interfacing should be intuitive, effortless and easy to learn. We approach the goal from two opposite but complementary directions: On the one hand, computer-user interaction can be enhanced if the computer can assess users' differences in an automated manner. Therefore we collected physiological and psychological data from people exposed to emotional stimuli and created a database for the community to use for further research in the context of automated learning to detect the differences in the inner states of users. We employed the data not only to predict the emotional state of users but also their personality traits. On the other hand, users need information dispatched by a computer to be easily, intuitively accessible. To minimize the cognitive effort of assimilating information we use a tactile device in the form of a belt and test how it can be best used to replace or augment the information received from other senses (e.g., visual and auditory) in a navigation task. We investigate how both approaches can be combined to improve specific applications.
Citations: 2

Classification of Children's Social Dominance in Group Interactions with Robots
Authors: Sarah Strohkorb, Iolanda Leite, Natalie Warren, B. Scassellati
DOI: https://doi.org/10.1145/2818346.2820735
Published: 2015-11-09
Abstract: As social robots become more widespread in educational environments, their ability to understand group dynamics and engage multiple children in social interactions is crucial. Social dominance is a highly influential factor in social interactions, expressed through both verbal and nonverbal behaviors. In this paper, we present a method for determining whether a participant is high or low in social dominance in a group interaction with children and robots. We investigated the correlation between many verbal and nonverbal behavioral features with social dominance levels collected through teacher surveys. We additionally implemented Logistic Regression and Support Vector Machines models with classification accuracies of 81% and 89%, respectively, showing that using a small subset of nonverbal behavioral features, these models can successfully classify children's social dominance level. Our approach for classifying social dominance is novel not only for its application to children, but also for achieving high classification accuracies using a reduced set of nonverbal features that, in future work, can be automatically extracted with current sensing technology.
Citations: 26

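A small sketch of the two classifiers reported above (logistic regression and an SVM), cross-validated on a handful of nonverbal behavioral features; the feature set and the synthetic measurements are placeholders, not the study's data.

```python
# Sketch of high/low social-dominance classification with Logistic Regression and an SVM
# on a small nonverbal feature set. Synthetic data stands in for the coded behaviors.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
n_children, n_features = 40, 6        # e.g., speaking time, gaze received, gestures, ...
X = rng.normal(size=(n_children, n_features))
y = rng.integers(0, 2, size=n_children)       # high (1) vs. low (0) social dominance

for name, clf in [("LogisticRegression", LogisticRegression(max_iter=1000)),
                  ("SVM", SVC(kernel="rbf"))]:
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name}: mean CV accuracy = {scores.mean():.2f}")
```
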
Exploring Intent-driven Multimodal Interface for Geographical Information System
Authors: Feng Sun
DOI: https://doi.org/10.1145/2818346.2823304
Published: 2015-11-09
Abstract: Geographic Information Systems (GIS) offers a large amount of functions for performing spatial analysis and geospatial information retrieval. However, off-the-shelf GIS remains difficult to use for occasional GIS experts. The major problem lies in that its interface organizes spatial analysis tools and functions according to spatial data structures and corresponding algorithms, which is conceptually confusing and cognitively complex. Prior work identified the usability problem of conventional GIS interface and developed alternatives based on speech or gesture to narrow the gap between the high-functionality provided by GIS and its usability. This paper outlined my doctoral research goal in understanding human-GIS interaction activity, especially how interaction modalities assist to capture spatial analysis intention and influence collaborative spatial problem solving. We proposed a framework for enabling multimodal human-GIS interaction driven by intention. We also implemented a prototype GeoEASI (Geo-dialogue Environment for Assisted Spatial Inquiry) to demonstrate the effectiveness of our framework. GeoEASI understands commonly known spatial analysis intentions through multimodal techniques and is able to assist users to perform spatial analysis with proper strategies. Further work will evaluate the effectiveness of our framework, improve the reliability and flexibility of the system, extend the GIS interface for supporting multiple users, and integrate the system into GeoDeliberation. We will concentrate on how multimodality technology can be adopted in these circumstances and explore the potentials of it. The study aims to demonstrate the feasibility of building a GIS to be both useful and usable by introducing an intent-driven multimodal interface, forming the key to building a better theory of spatial thinking for GIS.
Citations: 1

Detecting and Synthesizing Synchronous Joint Action in Human-Robot Teams
Authors: T. Iqbal, L. Riek
DOI: https://doi.org/10.1145/2818346.2823315
Published: 2015-11-09
Abstract: To become capable teammates to people, robots need the ability to interpret human activities and appropriately adjust their actions in real time. The goal of our research is to build robots that can work fluently and contingently with human teams. To this end, we have designed novel nonlinear dynamical methods to automatically model and detect synchronous joint action (SJA) in human teams. We also have extended this work to enable robots to move jointly with human teammates in real time. In this paper, we describe our work to date, and discuss our future research plans to further explore this research space. The results of this work are expected to benefit researchers in social signal processing, human-machine interaction, and robotics.
Citations: 3

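The abstract does not spell out the nonlinear dynamical method, so the sketch below only illustrates the general task of scoring synchrony between two teammates' movement signals over sliding windows, using lag-tolerant windowed correlation purely as a stand-in metric (explicitly not the authors' technique).

```python
# Stand-in illustration of synchrony scoring between two movement signals: for each
# sliding window, take the best absolute correlation over a small range of lags.
import numpy as np

def windowed_synchrony(a: np.ndarray, b: np.ndarray,
                       win: int = 100, max_lag: int = 10) -> np.ndarray:
    """Per window, best absolute Pearson correlation of a against lagged copies of b."""
    scores = []
    for start in range(max_lag, len(a) - win - max_lag, win):
        x = a[start:start + win]
        best = max(abs(np.corrcoef(x, b[start + lag:start + lag + win])[0, 1])
                   for lag in range(-max_lag, max_lag + 1))
        scores.append(best)
    return np.array(scores)

t = np.linspace(0, 20, 2000)
person = np.sin(2 * np.pi * 0.5 * t) + 0.1 * np.random.default_rng(6).normal(size=t.size)
robot = np.roll(person, 5)                       # robot roughly follows with a short delay
print(windowed_synchrony(person, robot)[:5])     # values near 1 indicate synchrony
```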