{"title":"Hierarchical Committee of Deep CNNs with Exponentially-Weighted Decision Fusion for Static Facial Expression Recognition","authors":"Bo-Kyeong Kim, Hwaran Lee, Jihyeon Roh, Soo-Young Lee","doi":"10.1145/2818346.2830590","url":"https://doi.org/10.1145/2818346.2830590","journal":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","publicationDate":"2015-11-09","abstract":"We present a pattern recognition framework to improve committee machines of deep convolutional neural networks (deep CNNs) and its application to static facial expression recognition in the wild (SFEW). In order to generate enough diversity of decisions, we trained multiple deep CNNs by varying network architectures, input normalization, and weight initialization, as well as by adopting several learning strategies that use large external databases. Moreover, with these deep models, we formed hierarchical committees using the validation-accuracy-based exponentially-weighted average (VA-Expo-WA) rule. Through extensive experiments, the strengths of our committee machines were demonstrated in both structural and decisional ways. On the SFEW2.0 dataset released for the 3rd Emotion Recognition in the Wild (EmotiW) sub-challenge, a test accuracy of 57.3% was obtained from the best single deep CNN, while the single-level committees yielded 58.3% with the simple average rule and 60.5% with the VA-Expo-WA rule. Our final submission, based on the 3-level hierarchy using the VA-Expo-WA rule, achieved 61.6%, significantly higher than the SFEW baseline of 39.1%."}
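The abstract above describes fusing committee members with weights that grow exponentially with each model's validation accuracy. A minimal sketch of that idea, assuming per-model class-probability outputs and an illustrative scaling hyperparameter `beta` (the paper's exact weighting constants are not given in the abstract):

```python
import numpy as np

def va_expo_wa(probs, val_accs, beta=10.0):
    """Validation-accuracy-based exponentially-weighted average (sketch).

    probs    : (n_models, n_classes) per-model class probabilities
    val_accs : (n_models,) validation accuracies in [0, 1]
    beta     : assumed scaling hyperparameter, not from the paper
    """
    probs = np.asarray(probs, dtype=float)
    # Weights grow exponentially with validation accuracy, then normalize.
    w = np.exp(beta * np.asarray(val_accs, dtype=float))
    w /= w.sum()
    # Weighted average of the members' class distributions.
    return w @ probs
```

With `beta=10.0`, a model at 0.9 validation accuracy dominates one at 0.5; setting `beta=0` recovers the simple-average rule the abstract compares against.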
{"title":"Session details: Oral Session 5: Interaction Techniques","authors":"S. Oviatt","doi":"10.1145/3252450","url":"https://doi.org/10.1145/3252450","journal":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","publicationDate":"2015-11-09"}
{"title":"Session details: Keynote Address 1","authors":"Zhengyou Zhang","doi":"10.1145/3252443","url":"https://doi.org/10.1145/3252443","journal":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","publicationDate":"2015-11-09"}
{"title":"Nakama: A Companion for Non-verbal Affective Communication","authors":"Christian J. A. M. Willemse, G. M. Munters, J. V. Erp, D. Heylen","doi":"10.1145/2818346.2823299","url":"https://doi.org/10.1145/2818346.2823299","journal":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","publicationDate":"2015-11-09","abstract":"We present \"Nakama\": a communication device that supports affective communication between a child and a geographically separated parent. Nakama consists of a control unit at the parent's end and an actuated teddy bear for the child. The bear contains several communication channels, including social touch, temperature, and vibrotactile heartbeats, all aimed at increasing the sense of presence. The current version of Nakama is suitable for user evaluations in lab settings, with which we aim to gain a more thorough understanding of the opportunities and limitations of these less traditional communication channels."}
{"title":"Session details: Oral Session 6: Mobile and Wearable","authors":"M. Johnston","doi":"10.1145/3252451","url":"https://doi.org/10.1145/3252451","journal":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","publicationDate":"2015-11-09"}
{"title":"Detecting and Identifying Tactile Gestures using Deep Autoencoders, Geometric Moments and Gesture Level Features","authors":"Dana Hughes, N. Farrow, Halley P. Profita, N. Correll","doi":"10.1145/2818346.2830601","url":"https://doi.org/10.1145/2818346.2830601","journal":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","publicationDate":"2015-11-09","abstract":"While several sensing modalities and transduction approaches have been developed for tactile sensing in robotic skins, there has been much less work on extracting features for, or identifying, high-level gestures performed on the skin. In this paper, we investigate using deep neural networks with hidden Markov models (DNN-HMMs), geometric moments, and gesture-level features to identify a set of gestures performed on robotic skins. We demonstrate that these features are useful for identifying gestures, classifying a 14-class dataset with 56% accuracy and a 7-class dataset with 71% accuracy."}
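One of the low-level features the abstract above names is geometric moments of the tactile input. A minimal sketch of raw geometric moments m_pq over a 2-D tactile pressure frame (the paper's exact moment orders and normalization are not specified in the abstract, so this is only an illustration of the feature family):

```python
import numpy as np

def geometric_moments(frame, max_order=2):
    """Raw geometric moments m_pq = sum_{x,y} x^p * y^q * I(y, x)
    of a 2-D tactile frame, up to total order max_order (a sketch;
    the paper's actual feature extraction may differ)."""
    frame = np.asarray(frame, dtype=float)
    # Coordinate grids: ys indexes rows, xs indexes columns.
    ys, xs = np.mgrid[0:frame.shape[0], 0:frame.shape[1]]
    return {(p, q): float(((xs ** p) * (ys ** q) * frame).sum())
            for p in range(max_order + 1)
            for q in range(max_order + 1)
            if p + q <= max_order}
```

m_00 is the total activation and (m_10/m_00, m_01/m_00) gives the centroid of contact, which is why low-order moments are a common compact descriptor of where and how strongly a skin is touched.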
{"title":"Interaction Studies with Social Robots","authors":"K. Dautenhahn","doi":"10.1145/2818346.2818347","url":"https://doi.org/10.1145/2818346.2818347","journal":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","publicationDate":"2015-11-09","abstract":"Over the past 10 years we have seen worldwide an immense growth of research and development into companion robots. These are robots that fulfil particular tasks, but do so in a socially acceptable manner. The companionship aspect reflects the repeated and long-term nature of such interactions, and the potential of people to form relationships with such robots, e.g. as friendly assistants. A number of companion and assistant robots have been entering the market, two of the latest examples being Aldebaran's Pepper robot and Jibo (Cynthia Breazeal). Companion robots are more and more targeting particular application areas, e.g. as home assistants or therapeutic tools. Research into companion robots needs to address many fundamental research problems concerning perception, cognition, action and learning, but regardless of how sophisticated our robotic systems may be, the potential users need to be taken into account from the early stages of development. The talk will emphasize the need for a highly user-centred approach towards design, development and evaluation of companion robots. An important challenge is to evaluate robots in realistic and long-term scenarios, in order to capture as closely as possible those key aspects that will play a role when using such robots in the real world. To illustrate these points, my talk will give examples of interaction studies that my research team has been involved in. This includes studies into how people perceive robots' non-verbal cues, creating and evaluating realistic scenarios for home companion robots using narrative framing, and verbal and tactile interaction of children with the therapeutic and social robot Kaspar. The talk will highlight the issues we encountered when we proceeded from laboratory-based experiments and prototypes to real-world applications."}
{"title":"Capturing AU-Aware Facial Features and Their Latent Relations for Emotion Recognition in the Wild","authors":"Anbang Yao, Junchao Shao, Ningning Ma, Yurong Chen","doi":"10.1145/2818346.2830585","url":"https://doi.org/10.1145/2818346.2830585","journal":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","publicationDate":"2015-11-09","abstract":"The Emotion Recognition in the Wild (EmotiW) Challenge has been held for three years. Previous winning teams primarily focused on designing specific deep neural networks or on fusing diverse hand-crafted and deep convolutional features. They all neglected to explore the significance of the latent relations among changing features resulting from facial muscle motions. In this paper, we study this recognition challenge from the perspective of analyzing the relations among expression-specific facial features in an explicit manner. Our method has three key components. First, we propose a pair-wise learning strategy to automatically seek a set of facial image patches which are important for discriminating two particular emotion categories. We found that these learnt local patches are in part consistent with the locations of expression-specific Action Units (AUs); thus the features extracted from such facial patches are named AU-aware facial features. Second, in each pair-wise task, we use an undirected graph structure, which takes learnt facial patches as individual vertices, to encode feature relations between any two learnt facial patches. Finally, a robust emotion representation is constructed by concatenating all task-specific graph-structured facial feature relations sequentially. Extensive experiments on the EmotiW 2015 Challenge testify to the efficacy of the proposed approach. Without using additional data, our final submissions achieved competitive results on both sub-challenges: in image-based static facial expression recognition, we obtained 55.38% recognition accuracy, outperforming the 39.13% baseline by a margin of 16.25%; in audio-video based emotion recognition, we obtained 53.80% recognition accuracy, outperforming the 39.33% baseline and the 2014 winning team's final result of 50.37% by margins of 14.47% and 3.43%, respectively."}
{"title":"Video and Image based Emotion Recognition Challenges in the Wild: EmotiW 2015","authors":"Abhinav Dhall, O. V. R. Murthy, Roland Göcke, Jyoti Joshi, Tom Gedeon","doi":"10.1145/2818346.2829994","url":"https://doi.org/10.1145/2818346.2829994","journal":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","publicationDate":"2015-11-09","abstract":"The third Emotion Recognition in the Wild (EmotiW) challenge 2015 consists of audio-video based emotion and static image based facial expression classification sub-challenges, which mimic real-world conditions. The two sub-challenges are based on the Acted Facial Expression in the Wild (AFEW) 5.0 and the Static Facial Expression in the Wild (SFEW) 2.0 databases, respectively. The paper describes the data, baseline method, challenge protocol and the challenge results. A total of 12 and 17 teams participated in the video based emotion and image based expression sub-challenges, respectively."}
{"title":"Attention and Engagement Aware Multimodal Conversational Systems","authors":"Zhou Yu","doi":"10.1145/2818346.2823309","url":"https://doi.org/10.1145/2818346.2823309","journal":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","publicationDate":"2015-11-09","abstract":"Despite their ability to complete certain tasks, dialog systems still suffer from poor adaptation to users' engagement and attention. We observe human behaviors in different conversational settings to understand human communication dynamics and then transfer the knowledge to multimodal dialog system design. To focus solely on maintaining engaging conversations, we design and implement a non-task-oriented multimodal dialog system, which serves as a framework for controlled multimodal conversation analysis. We design computational methods to model user engagement and attention in real time by leveraging automatically harvested multimodal human behaviors, such as smiles and speech volume. We aim to design and implement a multimodal dialog system that coordinates with users' engagement and attention on the fly via techniques such as adaptive conversational strategies and incremental speech production."}