A mobile emotion recognition system based on speech signals and facial images

Yu-Hao Wu, Shu-Jing Lin, Don-Lin Yang
DOI: 10.1109/ICSEC.2013.6694781
Published in: 2013 International Computer Science and Engineering Conference (ICSEC), September 2013
Citations: 8

Abstract

Smartphones are used daily for personal and business communications, and they have become a primary medium for capturing human emotions. By recognizing the emotions of speakers during a conversation, one can deliver or understand messages better, negotiate more successfully, and provide more personalized services. We therefore developed an emotion recognition system on a mobile platform based on speech signals and facial images. The research has two phases: a training phase and a testing phase. In the first phase, speech signals and facial images are processed through data preprocessing, feature extraction, and SVM classifier construction. In the second phase, participants produced video recordings as test data; these data were transformed for feature extraction and classified into four emotion classes using the trained classifiers. Feature selection methods were applied to choose useful features, and we proposed an adjustable weighted segmentation method to determine the final emotion recognition results. Various experiments using real-world simulations were performed to evaluate the proposed system. The results showed an average accuracy of 87 percent, with a peak accuracy of 91 percent. Facial images were also used to improve emotion recognition, especially during periods of silence in conversations.
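The pipeline described above (per-modality SVM classifiers whose outputs are combined by an adjustable weighting) can be sketched roughly as follows. This is a hypothetical illustration, not the paper's implementation: the feature dimensions, the four emotion labels' ordering, the `fuse` helper, and the weight value are all assumptions, and synthetic random data stands in for the extracted speech and facial features.

```python
# Hypothetical sketch of a two-modality emotion pipeline: one SVM per modality
# (speech features, facial-image features), with class probabilities combined
# by an adjustable weight. Synthetic data replaces real extracted features.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
EMOTIONS = ["neutral", "happy", "sad", "angry"]  # four classes, as in the paper

# Stand-ins for extracted features (e.g. prosodic/MFCC vectors for speech,
# geometric descriptors for faces); real values come from preprocessing.
X_speech = rng.normal(size=(200, 12))
X_face = rng.normal(size=(200, 8))
y = rng.integers(0, 4, size=200)

speech_clf = SVC(probability=True).fit(X_speech, y)
face_clf = SVC(probability=True).fit(X_face, y)

def fuse(p_speech, p_face, w_speech=0.6):
    """Weighted combination of per-class probabilities; the weight could be
    shifted toward the face modality during silent segments."""
    return w_speech * p_speech + (1.0 - w_speech) * p_face

# Classify one test sample using both modalities.
p = fuse(speech_clf.predict_proba(X_speech[:1]),
         face_clf.predict_proba(X_face[:1]))
print(EMOTIONS[int(np.argmax(p))])
```

Lowering `w_speech` when the audio segment is silent mirrors the paper's observation that facial images help most when the speaker is not talking.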