{"title":"Responding with Sentiment Appropriate for the User's Current Sentiment in Dialog as Inferred from Prosody and Gaze Patterns","authors":"Anindita Nath","doi":"10.1145/3242969.3264974","DOIUrl":null,"url":null,"abstract":"Multi-modal sentiment detection from natural video/audio streams has recently received much attention. I propose to use this multi-modal information to develop a technique, Sentiment Coloring , that utilizes the detected sentiments to generate effective responses. In particular, I aim to produce suggested responses colored with sentiment appropriate for that present in the interlocutor's speech. To achieve this, contextual information pertaining to sentiment, extracted from the past as well as the current speech of both the speakers in a dialog, will be utilized. Sentiment, here, includes the three polarities: positive, neutral and negative, as well as other expressions of stance and attitude. Utilizing only the non-verbal cues, namely, prosody and gaze, I will implement two algorithmic approaches and compare their performance in sentiment detection: a simple machine learning algorithm (neural networks), that will act as the baseline, and a deep learning approach, an end-to-end bidirectional LSTM RNN, which is the state-of-the-art in emotion classification. I will build a responsive spoken dialog system(s) with this Sentiment Coloring technique and evaluate the same with human subjects to measure benefits of the technique in various interactive environments.","PeriodicalId":308751,"journal":{"name":"Proceedings of the 20th ACM International Conference on Multimodal Interaction","volume":"112 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 20th ACM International Conference on Multimodal Interaction","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3242969.3264974","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
Multi-modal sentiment detection from natural video/audio streams has recently received much attention. I propose to use this multi-modal information to develop a technique, Sentiment Coloring, that utilizes the detected sentiments to generate effective responses. In particular, I aim to produce suggested responses colored with sentiment appropriate to that present in the interlocutor's speech. To achieve this, I will utilize contextual sentiment information extracted from both the past and current speech of the two speakers in a dialog. Sentiment here includes the three polarities (positive, neutral, and negative) as well as other expressions of stance and attitude. Using only non-verbal cues, namely prosody and gaze, I will implement two algorithmic approaches and compare their sentiment-detection performance: a simple machine-learning baseline (a neural network) and a deep-learning approach, an end-to-end bidirectional LSTM RNN, which is the state of the art in emotion classification. I will build responsive spoken dialog systems with this Sentiment Coloring technique and evaluate them with human subjects to measure its benefits in various interactive environments.
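The abstract gives no implementation details, so the following is only a minimal hypothetical sketch of the proposed comparison: a simple neural-network baseline versus an end-to-end bidirectional LSTM, each classifying the three sentiment polarities from per-frame non-verbal features. The feature dimension, sequence length, and hyperparameters below are illustrative assumptions, not values from the paper.

```python
# Hypothetical sketch (not the author's code) of the two classifiers the
# abstract proposes to compare: a simple neural-network baseline and an
# end-to-end bidirectional LSTM, over per-frame prosody + gaze features.
import torch
import torch.nn as nn

NUM_CLASSES = 3   # negative / neutral / positive (from the abstract)
FEAT_DIM = 8      # assumed: e.g. pitch, energy, speaking rate, gaze angles
SEQ_LEN = 100     # assumed number of frames per utterance

class BaselineMLP(nn.Module):
    """Simple neural-network baseline: mean-pools frames, then classifies."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(FEAT_DIM, 64), nn.ReLU(), nn.Linear(64, NUM_CLASSES))

    def forward(self, x):             # x: (batch, seq_len, feat_dim)
        return self.net(x.mean(dim=1))

class BiLSTMClassifier(nn.Module):
    """End-to-end bidirectional LSTM over the full frame sequence."""
    def __init__(self, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(FEAT_DIM, hidden,
                            batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, NUM_CLASSES)

    def forward(self, x):             # x: (batch, seq_len, feat_dim)
        out, _ = self.lstm(x)         # (batch, seq_len, 2 * hidden)
        return self.head(out[:, -1])  # classify from the final time step

if __name__ == "__main__":
    batch = torch.randn(4, SEQ_LEN, FEAT_DIM)  # dummy prosody+gaze frames
    for model in (BaselineMLP(), BiLSTMClassifier()):
        logits = model(batch)
        print(type(model).__name__, logits.shape)  # -> (4, 3) each
```

The contrast the sketch is meant to surface: the baseline discards temporal order by mean-pooling the frames, whereas the BiLSTM reads the sequence in both directions, which is what makes an end-to-end recurrent model a plausibly stronger sentiment detector here.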