{"title":"Multi-Scale Convolutional Recurrent Neural Network with Ensemble Method for Weakly Labeled Sound Event Detection","authors":"Yingmei Guo, Mingxing Xu, Zhiyong Wu, Jianming Wu, Bin Su","doi":"10.1109/ACIIW.2019.8925176","DOIUrl":"https://doi.org/10.1109/ACIIW.2019.8925176","url":null,"abstract":"In this paper, we describe our contributions to the challenge of detection and classification of acoustic scenes and events. We propose multi-scale convolutional recurrent neural network (Multi-scale CRNN), a novel weakly-supervised learning framework for sound event detection. By integrating information from different time resolutions, the multi-scale method can capture both the fine-grained and coarse-grained features of sound events and model the temporal dependency including fine-grained dependency and long-term dependency. Furthermore, the ensemble method proposed in the paper reduces the frame-level prediction errors using classification results. The proposed method achieves 29.2% in the event-based F1-score and 1.40 in event-based error rate on the development set of DCASE2018 task4 compared to the baseline of 14.1% F-value and 1.54 error rate [1].","PeriodicalId":193568,"journal":{"name":"2019 8th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124179157","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comfortability Detection for Adaptive Human-Robot Interactions","authors":"Maria Elena Lechuga Redondo","doi":"10.1109/ACIIW.2019.8925017","DOIUrl":"https://doi.org/10.1109/ACIIW.2019.8925017","url":null,"abstract":"Recognizing emotional states from nonverbal cues is basic for any kind of social interaction. Extrapolating this capability to robots would definitely attribute them skills which might enhance their interactions with people. This thesis looks to achieve two main goals. The first one is to unravel the Comfortability concept, which we define as the person's internal agreement-acceptance to the situation that arises as a result of an interaction. The second and main goal is to build a robot-embedded system capable of recognizing this internal state, adapting its behavior accordingly. The recognition model will be developed by applying artificial intelligence techniques for temporal modeling data through visual information (body movements and facial expressions). Then, the adaptation model will take into account both the Comfortability perceived, as well as contextual information (concretely, the previous task performed) in order to decide the consecutive action that the robot will perform.","PeriodicalId":193568,"journal":{"name":"2019 8th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121295920","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Vicarious Value Learning and Inference in Human-Human and Human-Robot Interaction","authors":"Robert J. Lowe, A. Almer, P. Gander, C. Balkenius","doi":"10.1109/ACIIW.2019.8925235","DOIUrl":"https://doi.org/10.1109/ACIIW.2019.8925235","url":null,"abstract":"Among the biggest challenges for researchers of human-robot interaction is imbuing robots with lifelong learning capacities that allow efficient interactions between humans and robots. In order to address this challenge we are developing computational mechanisms for a humanoid robotic agent utilizing both system 1 and system 2-like cognitive processing capabilities. At the core of this processing is a Social Affective Appraisal model that allows for vicarious value learning and inference. Using a multi-dimensional reinforcement learning approach the robotic agent learns affective value-based functions (system 1). This learning can ground representations of affective relations (predicates) relevant to interacting agents. In this article we discuss the existing theoretical basis for developing our neural network model as a system 1-like process. We also discuss initial ideas for developing system 2-like top-down/generative affective (semantic relation-based) processing. The aim of the symbolic-connectionist architectural development is to promote autonomous capabilities in humanoid robots for interacting efficiently/intelligently (recombinant application of learned associations) with humans in changing and challenging environments.","PeriodicalId":193568,"journal":{"name":"2019 8th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW)","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116382588","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Facial Expression Recognition with Identity and Spatial-temporal Integrated Learning","authors":"Jianing Teng, Dong Zhang, Ming Li, Yudong Huang","doi":"10.1109/ACIIW.2019.8925212","DOIUrl":"https://doi.org/10.1109/ACIIW.2019.8925212","url":null,"abstract":"Spatial-temporal structure of expression frames plays a critical role in the task of video based facial expression recognition (FER). In this paper, we propose a 3D CNN based framework to learn the spatial-temporal structure from expression frames for video-based FER. First, we use the data labeled with identities to train an identity network to capture the facial biometric features from expression frames. Second, we remove the impact of facial biometric features from the expression features and construct typical facial expression (TFE) features. Then, we feed the TFE features to a 3D network to discover spatial-temporal structure of expression frames. In the end, we feed the spatial-temporal vector to a fully-connected layer to get a vector for classification. The proposed method achieves comparable accuracy with the state-of-the-art of 88.54% on Oulu-CASIA, and is efficient enough to be used for the task of video-based FER.","PeriodicalId":193568,"journal":{"name":"2019 8th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123716675","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-Timescale Sensitive Movement Technologies: the EnTimeMent project","authors":"A. Camurri","doi":"10.1109/ACIIW.2019.8925207","DOIUrl":"https://doi.org/10.1109/ACIIW.2019.8925207","url":null,"abstract":"This paper introduces the EU FET PROACTIVE 4-year project EnTimeMent. EnTimeMent aims at a radical change in scientific research and enabling technologies for human movement qualitative analysis, entrainment and prediction, based on a novel neuro-cognitive approach of the multiple, mutually interactive time scales characterizing human behaviour. The main technological breakthrough of EnTimeMent will be promoting novel perspectives on understanding, measuring and predicting the qualities of movement (at individual and group level) in motion capture, multisensory interfaces, wearables, affective and IoT technologies.","PeriodicalId":193568,"journal":{"name":"2019 8th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125531466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deliberative and Affective Reasoning: a Bayesian Dual-Process Model","authors":"J. Hoey, Z. Sheikhbahaee, N. MacKinnon","doi":"10.1109/ACIIW.2019.8925215","DOIUrl":"https://doi.org/10.1109/ACIIW.2019.8925215","url":null,"abstract":"The presence of artificial agents in human social networks is growing. From chatbots to robots, human experience in the developed world is moving towards a socio-technical system in which agents can be technological or biological, with increasingly blurred distinctions between. Given that emotion is a key element of human interaction, enabling artificial agents with the ability to reason about affect is a key stepping stone towards a future in which technological agents and humans can work together. This paper presents work on building intelligent computational agents that integrate both emotion and cognition. These agents are grounded in the well-established social-psychological Bayesian Affect Control Theory (BayesAct). The core idea of BayesAct is that humans are motivated in their social interactions by affective alignment: they strive for their social experiences to be coherent at a deep, emotional level with their sense of identity and general world views as constructed through culturally shared symbols. This affective alignment creates cohesive bonds between group members, and is instrumental for collaborations to solidify as relational group commitments. BayesAct agents are motivated in their social interactions by a combination of affective alignment and decision theoretic reasoning, trading the two off as a function of the uncertainty or unpredictability of the situation. This paper provides a high-level view of dual process theories and advances BayesAct as a plausible, computationally tractable model based in social-psychological and sociological theory.","PeriodicalId":193568,"journal":{"name":"2019 8th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW)","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134446988","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Effectiveness Comparison between the Use of Activity State Data and That of Activity Magnitude Data in Chronic Stress Recognition","authors":"Yoshiki Nakashima, Terumi Umematsu, M. Tsujikawa, Yoshifumi Onishi","doi":"10.1109/ACIIW.2019.8925222","DOIUrl":"https://doi.org/10.1109/ACIIW.2019.8925222","url":null,"abstract":"Our aim is to improve the performance of the early recognition of chronic stress, through more effective monitoring of physiological signals produced as people live their daily lives (as opposed to monitoring during brief examination periods when physical activity is controlled). Physiological signals are influenced not only by responses to stress but also by physical activities, and it is necessary to distinguish between these two types of influence. There are basically two approaches to doing this. One is to separate the signals in terms of states of physical activity, such as sitting, walking, or running (the “Activity State” approach), and the other is to separate the signals in terms of the magnitude of physical activity (the “Activity Magnitude” approach). To determine which approach leads to better stress recognition performance, we performed evaluations using a database of 64 subjects and compared results for the two approaches. Results showed that the “Activity State” approach was, to a statistically significant degree, superior to the “Activity Magnitude” approach in the recognition of chronic stress.","PeriodicalId":193568,"journal":{"name":"2019 8th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133921381","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis of Online Suicide Risk with Document Embeddings and Latent Dirichlet Allocation","authors":"Noah Jones, Natasha Jaques, Pat Pataranutaporn, Asma Ghandeharioun, Rosalind W. Picard","doi":"10.1109/ACIIW.2019.8925077","DOIUrl":"https://doi.org/10.1109/ACIIW.2019.8925077","url":null,"abstract":"Machine learning to infer suicide risk and urgency is applied to a dataset of Reddit users in which the risk and urgency labels were derived from crowdsource consensus. We present the results of machine learning models based on transfer learning from document embeddings trained on large external corpora, and find that they have very high F1 scores (.83-.92) in distinguishing which users are labeled as being most at risk of committing suicide. We further show that the document embedding approach outperforms a method based on word importance, where important words were identified by domain experts. Finally, we find, using a Latent Dirichlet Allocation (LDA) topic model, that users labeled at-risk for suicide post about different topics to the rest of Reddit than non-suicidal users.","PeriodicalId":193568,"journal":{"name":"2019 8th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122224321","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"I see what you did there: Understanding when to trust a ML model with NOVA","authors":"Tobias Baur, Alexander Heimerl, F. Lingenfelser, E. André","doi":"10.1109/ACIIW.2019.8925214","DOIUrl":"https://doi.org/10.1109/ACIIW.2019.8925214","url":null,"abstract":"In this demo paper we present NOVA, a machine learning and explanation interface that focuses on the automated analysis of social interactions. NOVA combines Cooperative Machine Learning (CML) and explainable AI (XAI) methods to reduce manual labelling efforts while simultaneously generating an intuitive understanding of the learning process of a classification system. Therefore, NOVA features a semi-automated labelling process in which users are provided with immediate visual feedback on the predictions, which gives insights into the strengths and weaknesses of the underlying classification system. Following an interactive and exploratory workflow, the performance of the model can be improved by manual revision of the predictions.","PeriodicalId":193568,"journal":{"name":"2019 8th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124848062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving Automatic Pain Level Recognition using Pain Site as an Auxiliary Task","authors":"Hui-Ting Hong, Jeng-Lin Li, Chun-Min Chang, Chi-Chun Lee","doi":"10.1109/ACIIW.2019.8925185","DOIUrl":"https://doi.org/10.1109/ACIIW.2019.8925185","url":null,"abstract":"Pain is an unpleasant sensory and distressing feeling usually induced by physical damages, and the intensity is further modulated by the experienced pain site. Objective assessment of pain is critical in a variety of clinical practices, however, the status quo in medical practices is based solely on self-report. Recent advancements have been observed in automatic assessment of pain using audio-video recordings, but most do not consider the complex clinical dependency between pain level and pain site. In this study, we propose a Task Specific Encoder with Soft Layer Ordering structure (TSEN-SLO) that utilizes a learnable tensor to flexibly share information between pain level and pain site while still keeping the representations of each task in their self-encoding layers to improve pain level recognition. Our network learns from both face and voice data and achieves accuracy of 70% and 48.1% in a binary and ternary self-report pain level classification in a challenging in-the-wild setting. The approach yields relative improvements of 6.5% and 9.1% compared to previous work on the same dataset. Further analysis also demonstrates the variation in the self-reported pain level as observed in the facial and acoustic features for different pain sites, which points toward a potential relationship between the neural mechanism behind internal pain sensation and its effect on expressive facial/vocal behaviors.","PeriodicalId":193568,"journal":{"name":"2019 8th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW)","volume":"151 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127517767","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}