{"title":"User Independent Gaze Estimation by Exploiting Similarity Measures in the Eye Pair Appearance Eigenspace","authors":"Nanxiang Li, C. Busso","doi":"10.1145/2663204.2663250","DOIUrl":"https://doi.org/10.1145/2663204.2663250","url":null,"abstract":"The design of gaze-based computer interfaces has been an active research area for over 40 years. One challenge of using gaze detectors is the repetitive calibration process required to adjust the parameters of the systems, and the constrained conditions imposed on the user for robust gaze estimation. We envision user-independent gaze detectors that do not require calibration, or any cooperation from the user. Toward this goal, we investigate an appearance-based approach, where we estimate the eigenspace for the gaze using principal component analysis (PCA). The projections are used as features of regression models that estimate the screen's coordinates. As expected, the performance of the approach decreases when the models are trained without data from the target user (i.e., user-independent condition). This study proposes an appealing training approach to bridge the gap in performance between user-dependent and user-independent conditions. Using the projections onto the eigenspace, the scheme identifies samples in training set that are similar to the testing images. We build the sample covariance matrix and the regression models only with these samples. We consider either similar frames or data from subjects with similar eye appearance. The promising results suggest that the proposed training approach is a feasible and convenient scheme for gaze-based multimodal interfaces.","PeriodicalId":389037,"journal":{"name":"Proceedings of the 16th International Conference on Multimodal Interaction","volume":"219 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132504077","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Oral Session 2: Multimodal Fusion","authors":"Björn Schuller","doi":"10.1145/3246742","DOIUrl":"https://doi.org/10.1145/3246742","url":null,"abstract":"","PeriodicalId":389037,"journal":{"name":"Proceedings of the 16th International Conference on Multimodal Interaction","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114368163","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The SWELL Knowledge Work Dataset for Stress and User Modeling Research","authors":"Saskia Koldijk, Maya Sappelli, S. Verberne, Mark Antonius Neerincx, Wessel Kraaij","doi":"10.1145/2663204.2663257","DOIUrl":"https://doi.org/10.1145/2663204.2663257","url":null,"abstract":"This paper describes the new multimodal SWELL knowledge work (SWELL-KW) dataset for research on stress and user modeling. The dataset was collected in an experiment, in which 25 people performed typical knowledge work (writing reports, making presentations, reading e-mail, searching for information). We manipulated their working conditions with the stressors: email interruptions and time pressure. A varied set of data was recorded: computer logging, facial expression from camera recordings, body postures from a Kinect 3D sensor and heart rate (variability) and skin conductance from body sensors. The dataset made available not only contains raw data, but also preprocessed data and extracted features. The participants' subjective experience on task load, mental effort, emotion and perceived stress was assessed with validated questionnaires as a ground truth. The resulting dataset on working behavior and affect is a valuable contribution to several research fields, such as work psychology, user modeling and context aware systems.","PeriodicalId":389037,"journal":{"name":"Proceedings of the 16th International Conference on Multimodal Interaction","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114462593","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"System for Presenting and Creating Smell Effects to Video","authors":"R. Suzuki, Shutaro Homma, Eri Matsuura, Ken-ichi Okada","doi":"10.1145/2663204.2663269","DOIUrl":"https://doi.org/10.1145/2663204.2663269","url":null,"abstract":"Olfaction has recently been gaining attention in information and communication technology, as shown by attempts in theaters to screen videos while emitting scents. However, because there is no current infrastructure to communicate and synchronize odor information with visual information, people cannot enjoy this experience at home. Therefore, we have constructed a system of smell videos which could be applied to television (TV), allowing viewers to experience scents while watching their videos. To solve the abovementioned technical problems, we propose using the existing system for broadcasting closed caption. Our system's implementation is mindful of both video viewers and producers, allowing the system on the viewer end to disperse odorants in synchronization with videos, and allowing producers to add odor information to videos. We finally verify the system's feasibility. We expect that this study will make smell videos become common, and people will enjoy ones in daily life in the near future.","PeriodicalId":389037,"journal":{"name":"Proceedings of the 16th International Conference on Multimodal Interaction","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128528747","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multimodal Interaction for Future Control Centers: An Interactive Demonstrator","authors":"Ferdinand Fuhrmann, Rene Kaiser","doi":"10.1145/2663204.2669620","DOIUrl":"https://doi.org/10.1145/2663204.2669620","url":null,"abstract":"This interactive demo exhibits a visionary multimodal interaction concept designed to support operators in future control centers. The applied multi-layered hardware and software architecture directly supports the operators in performing their lengthy monitoring and urgent alarm handling tasks. Operators are presented with visual information on three completely configurable levels of screen displays. Gesture interaction via skeleton and finger tracking acts as the main control interaction principle. We particularly developed a special sensor-equipped chair as well as an audio interface allowing for speaking and listening in isolation without any wearable device.","PeriodicalId":389037,"journal":{"name":"Proceedings of the 16th International Conference on Multimodal Interaction","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128687323","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deception detection using a multimodal approach","authors":"M. Abouelenien, Verónica Pérez-Rosas, Rada Mihalcea, Mihai Burzo","doi":"10.1145/2663204.2663229","DOIUrl":"https://doi.org/10.1145/2663204.2663229","url":null,"abstract":"In this paper we address the automatic identification of deceit by using a multimodal approach. We collect deceptive and truthful responses using a multimodal setting where we acquire data using a microphone, a thermal camera, as well as physiological sensors. Among all available modalities, we focus on three modalities namely, language use, physiological response, and thermal sensing. To our knowledge, this is the first work to integrate these specific modalities to detect deceit. Several experiments are carried out in which we first select representative features for each modality, and then we analyze joint models that integrate several modalities. The experimental results show that the combination of features from different modalities significantly improves the detection of deceptive behaviors as compared to the use of one modality at a time. Moreover, the use of non-contact modalities proved to be comparable with and sometimes better than existing contact-based methods. The proposed method increases the efficiency of detecting deceit by avoiding human involvement in an attempt to move towards a completely automated non-invasive deception detection process.","PeriodicalId":389037,"journal":{"name":"Proceedings of the 16th International Conference on Multimodal Interaction","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126000741","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep Multimodal Fusion: Combining Discrete Events and Continuous Signals","authors":"H. P. Martínez, Georgios N. Yannakakis","doi":"10.1145/2663204.2663236","DOIUrl":"https://doi.org/10.1145/2663204.2663236","url":null,"abstract":"Multimodal datasets often feature a combination of continuous signals and a series of discrete events. For instance, when studying human behaviour it is common to annotate actions performed by the participant over several other modalities such as video recordings of the face or physiological signals. These events are nominal, not frequent and are not sampled at a continuous rate while signals are numeric and often sampled at short fixed intervals. This fundamentally different nature complicates the analysis of the relation among these modalities which is often studied after each modality has been summarised or reduced. This paper investigates a novel approach to model the relation between such modality types bypassing the need for summarising each modality independently of each other. For that purpose, we introduce a deep learning model based on convolutional neural networks that is adapted to process multiple modalities at different time resolutions we name deep multimodal fusion. Furthermore, we introduce and compare three alternative methods (convolution, training and pooling fusion) to integrate sequences of events with continuous signals within this model. We evaluate deep multimodal fusion using a game user dataset where player physiological signals are recorded in parallel with game events. Results suggest that the proposed architecture can appropriately capture multimodal information as it yields higher prediction accuracies compared to single-modality models. In addition, it appears that pooling fusion, based on a novel filter-pooling method provides the more effective fusion approach for the investigated types of data.","PeriodicalId":389037,"journal":{"name":"Proceedings of the 16th International Conference on Multimodal Interaction","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125031884","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A World without Barriers: Connecting the World across Languages, Distances and Media","authors":"A. Waibel","doi":"10.1145/2663204.2669986","DOIUrl":"https://doi.org/10.1145/2663204.2669986","url":null,"abstract":"As our world becomes increasingly interdependent and globalization brings people together more than ever, we quickly discover that it is no longer the absence of connectivity (the \"digital divide\") that separates us, but that new and different forms of alienation still keep us apart, including language, culture, distance and interfaces. Can technology provide solutions to bring us closer to our fellow humans? In this talk, I will present multilingual and multimodal interface technology solutions that offer the best of both worlds: maintaining our cultural diversity and locale while providing for better communication, greater integration and collaboration. We explore: Smart phone based speech translators for everyday travelers and humanitarian missions Simultaneous translation systems and services to translate academic lectures and political speeches in real time (at Universities, the European Parliament and broadcasting services) Multimodal language-transparent interfaces and smartrooms to improve joint and distributed communication and interaction. We will first discuss the difficulties of language processing; review how the technology works today and what levels of performance are now possible. Key to today's systems is effective machine learning, without which scaling multilingual and multimodal systems to unlimited domains, modalities, accents, and more than 6,000 languages would be hopeless. Equally important are effective human-computer interfaces, so that language differences fade naturally into the background and communication and interaction become natural and engaging. I will present recent research results as well as examples from our field trials and deployments in educational, commercial, humanitarian and government settings.","PeriodicalId":389037,"journal":{"name":"Proceedings of the 16th International Conference on Multimodal Interaction","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129741869","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Facial Expression Analysis for Estimating Pain in Clinical Settings","authors":"Karan Sikka","doi":"10.1145/2663204.2666282","DOIUrl":"https://doi.org/10.1145/2663204.2666282","url":null,"abstract":"Pain assessment is vital for effective pain management in clinical settings. It is generally obtained via patient's self-report or observer's assessment. Both of these approaches suffer from several drawbacks such as unavailability of self-report, idiosyncratic use and observer bias. This work aims at developing automated machine learning based approaches for estimating pain in clinical settings. We propose to use facial expression information to accomplish current goals since previous studies have demonstrated consistency between facial behavior and experienced pain. Moreover, with recent advances in computer vision it is possible to design algorithms for identifying spontaneous expressions such as pain in more naturalistic conditions. Our focus is towards designing robust computer vision models for estimating pain in videos containing patient's facial behavior. In this regard we discuss different research problem, technical approaches and challenges that needs to be addressed. In this work we particularly highlight the problem of predicting self-report measures of pain intensity since this problem is not only more challenging but also received less attention. We also discuss our efforts towards collecting an in-situ pediatric pain dataset for validating these approaches. We conclude the paper by presenting some results on both UNBC Mc-Master Pain dataset and pediatric pain dataset.","PeriodicalId":389037,"journal":{"name":"Proceedings of the 16th International Conference on Multimodal Interaction","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126555811","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Appearance based user-independent gaze estimation","authors":"Nanxiang Li","doi":"10.1145/2663204.2666288","DOIUrl":"https://doi.org/10.1145/2663204.2666288","url":null,"abstract":"An ideal gaze user interface should be able to accurately estimates the user's gaze direction in a non-intrusive setting. Most studies on gaze estimation focus on the accuracy of the estimation results, imposing important constraints on the user such as no head movement, intrusive head mount setting and repetitive calibration process. Due to these limitations, most graphic user interfaces (GUIs) are reluctant to include gaze as an input modality. We envision user-independent gaze detectors for user computer interaction that do not impose any constraints on the users. We believe the appearance of the eye pairs, which implicitly reveals head pose, provides conclusive information on the gaze direction. More importantly, the relative appearance changes in the eye pairs due to the different gaze direction should be consistent among different human subjects. We collected a multimodal corpus (MSP-GAZE) to study and evaluate user independent, appearance based gaze estimation approaches. This corpus considers important factors that affect the appearance based gaze estimation: the individual difference, the head movement, and the distance between the user and the interface's screen. Using this database, our initial study focused on the eye pair appearance eigenspace approach, where the projections into the eye appearance eigenspace basis are used to build regression models to estimate the gaze position. We compare the results between user dependent (training and testing on the same subject) and user independent (testing subject is not included in the training data) models. As expected, due to the individual differences between subjects, the performance decreases when the models are trained without data from the target user. The study aims to reduce the gap between user dependent and user independent conditions.","PeriodicalId":389037,"journal":{"name":"Proceedings of the 16th International Conference on Multimodal Interaction","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121350365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}