{"title":"Transductive Transfer LDA with Riesz-based Volume LBP for Emotion Recognition in The Wild","authors":"Yuan Zong, Wenming Zheng, Xiaohua Huang, Jingwei Yan, T. Zhang","doi":"10.1145/2818346.2830584","DOIUrl":"https://doi.org/10.1145/2818346.2830584","url":null,"abstract":"In this paper, we propose the method using Transductive Transfer Linear Discriminant Analysis (TTLDA) and Riesz-based Volume Local Binary Patterns (RVLBP) for image based static facial expression recognition challenge of the Emotion Recognition in the Wild Challenge (EmotiW 2015). The task of this challenge is to assign facial expression labels to frames of some movies containing a face under the real word environment. In our method, we firstly employ a multi-scale image partition scheme to divide each face image into some image blocks and use RVLBP features extracted from each block to describe each facial image. Then, we adopt the TTLDA approach based on RVLBP to cope with the expression recognition task. The experiments on the testing data of SFEW 2.0 database, which is used for image based static facial expression challenge, demonstrate that our method achieves the accuracy of 50%. This result has a 10.87% improvement over the baseline provided by this challenge organizer.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"89 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82596697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Experiment on the Feasibility of Spatial Acquisition using a Moving Auditory Cue for Pedestrian Navigation","authors":"Yeseul Park, Kyle Koh, Heonjin Park, Jinwook Seo","doi":"10.1145/2818346.2820779","DOIUrl":"https://doi.org/10.1145/2818346.2820779","url":null,"abstract":"We conducted a feasibility study on the use of a moving auditory cue for spatial acquisition for pedestrian navigation by comparing its performance with a static auditory cue, the use of which has been investigated in previous studies. To investigate the performance of human sound azimuthal localization, we designed and conducted a controlled experiment with 15 participants and found that performance was statistically significantly more accurate with an auditory source moving from the opposite direction over users' heads to the target direction than with a static sound. Based on this finding, we designed a bimodal pedestrian navigation system using both visual and auditory feedback. We evaluated the system by conducting a field study with four users and received overall positive feedback.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"15 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84069178","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Record, Transform & Reproduce Social Encounters in Immersive VR: An Iterative Approach","authors":"Jan Kolkmeier","doi":"10.1145/2818346.2823314","DOIUrl":"https://doi.org/10.1145/2818346.2823314","url":null,"abstract":"Immersive Virtual Reality Environments that can be accessed through multimodal natural interfaces will bring new affordances to mediated interaction with virtual embodied agents and avatars. Such interfaces will measure, amongst others, users' poses and motion which can be copied to an embodied avatar representation of the user that is situated in a virtual or augmented reality space shared with autonomous virtual agents and human controlled or semi-autonomous avatars. Designers of such environments will be challenged to facilitate believable social interactions by creating agents or semi-autonomous avatars that can respond meaningfully to users' natural behaviors, as captured by these interfaces. In our future research, we aim to realize such interactions to create rich social encounters in immersive Virtual Reality. In this current work, we present the approach we envisage to analyze and learn agent behavior from human-agent interaction in an iterative fashion. We specifically look at small-scale, `regulative' nonverbal behaviors. Agents inform their behavior on previous observations, observing responses that these behaviors elicit in new users, thus iteratively generating corpora of short, situated human-agent interaction sequences that are to be analyzed, annotated and processed to generate socially intelligent agent behavior. Some choices and challenges of this approach are discussed.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"48 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80740630","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Emotion Recognition in the Wild via Convolutional Neural Networks and Mapped Binary Patterns","authors":"Gil Levi, Tal Hassner","doi":"10.1145/2818346.2830587","DOIUrl":"https://doi.org/10.1145/2818346.2830587","url":null,"abstract":"We present a novel method for classifying emotions from static facial images. Our approach leverages on the recent success of Convolutional Neural Networks (CNN) on face recognition problems. Unlike the settings often assumed there, far less labeled data is typically available for training emotion classification systems. Our method is therefore designed with the goal of simplifying the problem domain by removing confounding factors from the input images, with an emphasis on image illumination variations. This, in an effort to reduce the amount of data required to effectively train deep CNN models. To this end, we propose novel transformations of image intensities to 3D spaces, designed to be invariant to monotonic photometric transformations. These are applied to CASIA Webface images which are then used to train an ensemble of multiple architecture CNNs on multiple representations. Each model is then fine-tuned with limited emotion labeled training data to obtain final classification models. Our method was tested on the Emotion Recognition in the Wild Challenge (EmotiW 2015), Static Facial Expression Recognition sub-challenge (SFEW) and shown to provide a substantial, 15.36% improvement over baseline results (40% gain in performance).","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78837247","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Grenoble System for the Social Touch Challenge at ICMI 2015","authors":"Viet-Cuong Ta, W. Johal, Maxime Portaz, Eric Castelli, D. Vaufreydaz","doi":"10.1145/2818346.2830598","DOIUrl":"https://doi.org/10.1145/2818346.2830598","url":null,"abstract":"New technologies and especially robotics is going towards more natural user interfaces. Works have been done in different modality of interaction such as sight (visual computing), and audio (speech and audio recognition) but some other modalities are still less researched. The touch modality is one of the less studied in HRI but could be valuable for naturalistic interaction. However touch signals can vary in semantics. It is therefore necessary to be able to recognize touch gestures in order to make human-robot interaction even more natural. We propose a method to recognize touch gestures. This method was developed on the CoST corpus and then directly applied on the HAART dataset as a participation of the Social Touch Challenge at ICMI 2015. Our touch gesture recognition process is detailed in this article to make it reproducible by other research teams. Besides features set description, we manually filtered the training corpus to produce 2 datasets. For the challenge, we submitted 6 different systems. A Support Vector Machine and a Random Forest classifiers for the HAART dataset. For the CoST dataset, the same classifiers are tested in two conditions: using all or filtered training datasets. As reported by organizers, our systems have the best correct rate in this year's challenge (70.91% on HAART, 61.34% on CoST). Our performances are slightly better that other participants but stay under previous reported state-of-the-art results.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"10 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86553547","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Software Techniques for Multimodal Input Processing in Realtime Interactive Systems","authors":"Martin Fischbach","doi":"10.1145/2818346.2823308","DOIUrl":"https://doi.org/10.1145/2818346.2823308","url":null,"abstract":"Multimodal interaction frameworks are an efficient means of utilizing many existing processing and fusion techniques in a wide variety of application areas, even by non-experts. However, the application of these frameworks to highly interactive application areas like VR, AR, MR, and computer games in a reusable, modifiable, and modular manner is not straightforward. It currently lacks some software technical solutions that (1) preserve the general decoupling principle of platforms and at the same time (2) provide the required close temporal as well as semantic coupling of involved software modules and multimodal processing steps. This thesis approches current challenges and aims at providing the research community with a framework that fosters repeatability of scientific achievements and the ability to built on previous results.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"103 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88977095","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Effects of Good Speaking Techniques on Audience Engagement","authors":"Keith Curtis, G. Jones, N. Campbell","doi":"10.1145/2818346.2820766","DOIUrl":"https://doi.org/10.1145/2818346.2820766","url":null,"abstract":"Understanding audience engagement levels for presentations has the potential to enable richer and more focused interaction with audio-visual recordings. We describe an investigation into automated analysis of multimodal recordings of scientific talks where the use of modalities most typically associated with engagement such as eye-gaze is not feasible. We first study visual and acoustic features to identify those most commonly associated with good speaking techniques. To understand audience interpretation of good speaking techniques, we angaged human annotators to rate the qualities of the speaker for a series of 30-second video segments taken from a corpus of 9 hours of presentations from an academic conference. Our annotators also watched corresponding video recordings of the audience to presentations to estimate the level of audience engagement for each talk. We then explored the effectiveness of multimodal features extracted from the presentation video against Likert-scale ratings of each speaker as assigned by the annotators. and on manually labelled audience engagement levels. These features were used to build a classifier to rate the qualities of a new speaker. This was able classify a rating for a presenter over an 8-class range with an accuracy of 52%. By combining these classes to a 4-class range accuracy increases to 73%. We analyse linear correlations with individual speaker-based modalities and actual audience engagement levels to understand the corresponding effect on audience engagement. A further classifier was then built to predict the level of audience engagement to a presentation by analysing the speaker's use of acoustic and visual cues. Using these speaker based modalities pre-fused with speaker ratings only, we are able to predict actual audience engagement levels with an accuracy of 68%. By combining with basic visual features from the audience as whole, we are able to improve this to an accuracy of 70%.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"112 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79555617","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multiple Models Fusion for Emotion Recognition in the Wild","authors":"Jianlong Wu, Zhouchen Lin, H. Zha","doi":"10.1145/2818346.2830582","DOIUrl":"https://doi.org/10.1145/2818346.2830582","url":null,"abstract":"Emotion recognition in the wild is a very challenging task. In this paper, we propose a multiple models fusion method to automatically recognize the expression in the video clip as part of the third Emotion Recognition in the Wild Challenge (EmotiW 2015). In our method, we first extract dense SIFT, LBP-TOP and audio features from each video clip. For dense SIFT features, we use the bag of features (BoF) model with two different encoding methods (locality-constrained linear coding and group saliency based coding) to further represent it. During the classification process, we use partial least square regression to calculate the regression value of each model. By learning the optimal weight of each model based on the regression value, we fuse these models together. We conduct experiments on the given validation and test datasets, and achieve superior performance. The best recognition accuracy of our fusion method is 52.50% on the test dataset, which is 13.17% higher than the challenge baseline accuracy of 39.33%.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"26 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81101790","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gait and Postural Sway Analysis, A Multi-Modal System","authors":"Hafsa Ismail","doi":"10.1145/2818346.2823310","DOIUrl":"https://doi.org/10.1145/2818346.2823310","url":null,"abstract":"Detecting a fall before it actually happens will positively affect lives of the elderly. While the main causes of falling are related to postural sway and walking, determining abnormalities in one of these activities or both of them would be informative to predicting the fall probability. A need exists for a portable gait and postural sway analysis system that can provide individuals with real-time information about changes and quality of gait in the real world, not just in a laboratory. In this research project I aim to build a multi-modal system that finds the correlation between vision extracted features and accelerometer and force plate data to determine a general gait and body sway pattern. Then this information is used to assess a difference to normative age and gender relevant patterns as well as any changes over time. This could provide a core indicator of broader health and function in ageing and disease.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88636158","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Poster Session","authors":"R. Horaud, D. Bohus","doi":"10.1145/3252452","DOIUrl":"https://doi.org/10.1145/3252452","url":null,"abstract":"","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"52 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85623048","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}