{"title":"面向销售的实时情感识别","authors":"Si-Ahmed Naas, S. Sigg","doi":"10.1109/MSN50589.2020.00096","DOIUrl":null,"url":null,"abstract":"Positive emotion is a pre-condition to any sales contract. Likewise, the ability to perceive the emotions of a customer impacts sales performance.To support emotional perception in buyer-seller interactions, we propose an audio-visual emotion recognition system that can recognize eight emotions: neutral, calm, sad, happy, angry, fearful, surprised, and disgusted. We reduced noise in audio samples and we applied transfer learning for image feature extraction based on a pre-trained deep neural network VGG16. For emotion recognition, we successfully obtained an audio emotion-recognition accuracy of 62.51% and 68% and video emotion-recognition accuracy of 97.13% and 97.77% on the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) and Surrey Audio-Visual Expressed Emotion (SAVEE) datasets respectively. For the combination of the two models, our proposed merging mechanism without re-training achieved an accuracy of close to 100% on both datasets. Finally, we demonstrated our system for a customer satisfaction use case in a real customer-to-salesperson interaction using audio and video models, achieving an average accuracy of 78%.","PeriodicalId":447605,"journal":{"name":"2020 16th International Conference on Mobility, Sensing and Networking (MSN)","volume":"138 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":"{\"title\":\"Real-time Emotion Recognition for Sales\",\"authors\":\"Si-Ahmed Naas, S. Sigg\",\"doi\":\"10.1109/MSN50589.2020.00096\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Positive emotion is a pre-condition to any sales contract. Likewise, the ability to perceive the emotions of a customer impacts sales performance.To support emotional perception in buyer-seller interactions, we propose an audio-visual emotion recognition system that can recognize eight emotions: neutral, calm, sad, happy, angry, fearful, surprised, and disgusted. We reduced noise in audio samples and we applied transfer learning for image feature extraction based on a pre-trained deep neural network VGG16. For emotion recognition, we successfully obtained an audio emotion-recognition accuracy of 62.51% and 68% and video emotion-recognition accuracy of 97.13% and 97.77% on the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) and Surrey Audio-Visual Expressed Emotion (SAVEE) datasets respectively. For the combination of the two models, our proposed merging mechanism without re-training achieved an accuracy of close to 100% on both datasets. Finally, we demonstrated our system for a customer satisfaction use case in a real customer-to-salesperson interaction using audio and video models, achieving an average accuracy of 78%.\",\"PeriodicalId\":447605,\"journal\":{\"name\":\"2020 16th International Conference on Mobility, Sensing and Networking (MSN)\",\"volume\":\"138 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"7\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 16th International Conference on Mobility, Sensing and Networking (MSN)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/MSN50589.2020.00096\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 16th International Conference on Mobility, Sensing and Networking (MSN)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MSN50589.2020.00096","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Positive emotion is a pre-condition to any sales contract. Likewise, the ability to perceive the emotions of a customer impacts sales performance.To support emotional perception in buyer-seller interactions, we propose an audio-visual emotion recognition system that can recognize eight emotions: neutral, calm, sad, happy, angry, fearful, surprised, and disgusted. We reduced noise in audio samples and we applied transfer learning for image feature extraction based on a pre-trained deep neural network VGG16. For emotion recognition, we successfully obtained an audio emotion-recognition accuracy of 62.51% and 68% and video emotion-recognition accuracy of 97.13% and 97.77% on the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) and Surrey Audio-Visual Expressed Emotion (SAVEE) datasets respectively. For the combination of the two models, our proposed merging mechanism without re-training achieved an accuracy of close to 100% on both datasets. Finally, we demonstrated our system for a customer satisfaction use case in a real customer-to-salesperson interaction using audio and video models, achieving an average accuracy of 78%.