Nimish Ronghe, Sayali S. Nakashe, A. Pawar, S. Bobde
Title: Emotion recognition and reaction prediction in videos
DOI: 10.1109/ICRCICN.2017.8234476 (https://doi.org/10.1109/ICRCICN.2017.8234476)
Published in: 2017 Third International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN), 2017-11-01
Citations: 5
Abstract
Facial analysis in images and videos has been a relatively difficult task for machine learning models. Recent deep learning approaches have substantially improved results and reliability, and can be applied to problems such as face recognition, emotion recognition, and emotion reaction prediction. For emotion reaction prediction, emotion information from individual frames must often be aggregated over a variable-length sequence of frames, together with the speech signal, to produce a useful prediction. Emotion reaction prediction is a sequence analysis task and relies heavily on dynamic temporal and spectral features. Convolutional neural networks (CNNs) have been used extensively for emotion recognition and have produced reliable results. However, they cannot extract time-series information from a sequence of inputs and therefore cannot model an emotion transaction. Recurrent neural networks (RNNs) are widely used because they yield strong results on a variety of sequence analysis tasks. In this work, we propose a system for emotion recognition and reaction prediction in videos. The primary focus is an experimental analysis of a hybrid CNN-RNN architecture for emotion transaction analysis that recognizes the emotion in a video frame and predicts an appropriate reaction to it.
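The abstract does not give layer sizes, label sets, or training details, so the following is only a minimal sketch of the general CNN-RNN idea it describes: a convolutional stage turns each frame into a feature vector, a per-frame classifier reads the emotion from that vector, and a recurrent stage aggregates the vectors over a variable-length clip to predict a reaction. Everything concrete here is an assumption made for illustration: the `EMOTIONS` label set (reused for reactions), the single 3x3 convolution layer, the Elman-style recurrence, the function names, and the random untrained weights. The speech signal mentioned in the abstract is left out.

```python
import math
import random

# Hypothetical label set; the abstract does not list the classes used.
EMOTIONS = ["angry", "happy", "neutral", "sad"]

def conv2d_valid(image, kernel):
    """Valid 2D cross-correlation of a grayscale image (list of rows) with a kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def frame_features(image, kernels):
    """CNN stand-in: one conv layer, ReLU, global average pooling (one value per kernel)."""
    feats = []
    for kernel in kernels:
        fmap = conv2d_valid(image, kernel)
        vals = [max(0.0, v) for row in fmap for v in row]
        feats.append(sum(vals) / len(vals))
    return feats

def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def rnn_last_state(seq, w_in, w_rec):
    """Elman recurrence h_t = tanh(W_in x_t + W_rec h_{t-1}); returns the final state."""
    h = [0.0] * len(w_rec)
    for x in seq:
        h = [math.tanh(p + q) for p, q in zip(matvec(w_in, x), matvec(w_rec, h))]
    return h

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]

def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def build_model(seed=0, n_kernels=3, hidden=5):
    """Random, untrained weights; a real system would learn these from labeled video."""
    rng = random.Random(seed)
    return {
        "kernels": [rand_matrix(rng, 3, 3) for _ in range(n_kernels)],
        "w_in": rand_matrix(rng, hidden, n_kernels),
        "w_rec": rand_matrix(rng, hidden, hidden),
        "w_emotion": rand_matrix(rng, len(EMOTIONS), n_kernels),
        "w_reaction": rand_matrix(rng, len(EMOTIONS), hidden),
    }

def predict(model, frames):
    """Return one emotion distribution per frame and one reaction distribution per clip."""
    feats = [frame_features(f, model["kernels"]) for f in frames]
    per_frame = [softmax(matvec(model["w_emotion"], x)) for x in feats]
    final_state = rnn_last_state(feats, model["w_in"], model["w_rec"])
    reaction = softmax(matvec(model["w_reaction"], final_state))
    return per_frame, reaction

# Usage: a synthetic 6-frame clip of 8x8 grayscale frames.
demo_rng = random.Random(1)
clip = [[[demo_rng.random() for _ in range(8)] for _ in range(8)] for _ in range(6)]
per_frame, reaction = predict(build_model(), clip)
print(len(per_frame), "frame-level predictions; reaction distribution:", reaction)
```

Because the recurrence consumes one feature vector at a time, the same weights handle clips of any length, which is the property the abstract says a CNN alone does not provide.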