S. Bedoya-Jaramillo, E. Belalcázar-Bolaños, T. Villa-Cañas, J. Orozco-Arroyave, J. D. Arias-Londoño, J. Vargas-Bonilla
Automatic emotion detection in speech using mel frequency cepstral coefficients
2012 XVII Symposium of Image, Signal Processing, and Artificial Vision (STSIVA), published 2012-11-12
DOI: 10.1109/STSIVA.2012.6340558
Citations: 8
Abstract
Emotional states produce physiological alterations in the vocal tract, introducing variability into the acoustic parameters of speech. Emotion recognition from speech can be used in human-machine interaction applications, speaker verification, analysis of neurological disorders, and psychological diagnostic tools. This paper proposes the use of Mel Frequency Cepstral Coefficients (MFCC) for the automatic detection of emotions in running speech. Experiments were conducted on the Berlin emotional speech database for a three-class problem (anger, boredom, and neutral emotional states). To evaluate the discriminative ability of the features, three different classifiers were implemented: k-nearest neighbors, and linear and quadratic Bayesian classifiers. The highest accuracy is obtained when the neutral and anger emotions are evaluated.
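The abstract does not specify the paper's exact MFCC configuration (frame length, filterbank size, or number of coefficients), so the following is only a minimal numpy sketch of the standard MFCC pipeline the paper relies on: frame and window the signal, take the power spectrum, apply a triangular mel filterbank, and decorrelate the log filterbank energies with a DCT-II. All parameter values (16 kHz sampling rate, 512-sample frames, 26 mel filters, 13 coefficients) are assumptions, not values from the paper.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, sr=16000, n_fft=512, hop=256, n_mels=26, n_ceps=13):
    # Assumed parameters; the paper's actual settings are not given in the abstract.
    # 1. Frame the signal and apply a Hamming window.
    frames = np.array([
        signal[start:start + n_fft] * np.hamming(n_fft)
        for start in range(0, len(signal) - n_fft + 1, hop)
    ])
    # 2. Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # 3. Triangular mel filterbank, equally spaced on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        fbank[i - 1, left:center] = (np.arange(left, center) - left) / max(center - left, 1)
        fbank[i - 1, center:right] = (right - np.arange(center, right)) / max(right - center, 1)
    # 4. Log filterbank energies, then DCT-II to get the cepstral coefficients.
    log_energies = np.log(power @ fbank.T + 1e-10)
    dct_mat = np.cos(np.pi / n_mels * (np.arange(n_mels) + 0.5)[:, None]
                     * np.arange(n_ceps)[None, :])
    return log_energies @ dct_mat  # shape: (num_frames, n_ceps)

# Usage on a synthetic one-second 440 Hz tone:
sr = 16000
t = np.arange(sr) / sr
coeffs = mfcc(np.sin(2 * np.pi * 440.0 * t), sr=sr)
```

The resulting per-frame coefficient vectors are what the paper's classifiers (k-nearest neighbors and the Bayesian classifiers) would consume as features.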