Vocal Emotion Recognition with Log-Gabor Filters

Proceedings of the 5th International Workshop on Audio/Visual Emotion Challenge. Pub Date: 2015-10-26. DOI: 10.1145/2808196.2811635

Yu Gu, E. Postma, H. Lin


Citations: 6

Abstract

Vocal emotion recognition aims to identify the emotional states of speakers by analyzing their speech signals. This paper builds on the work of Ezzat, Bouvrie and Poggio by performing a spectro-temporal analysis of affective vocalizations, decomposing the associated spectrograms with 2D Gabor filters. Based on previous studies of how emotion is expressed in the voice and how that expression appears in the spectrogram, we assumed that each vocal emotion has a unique spectro-temporal signature in the form of oriented energy bands, which can be detected by properly tuned Gabor filters. We compared the emotion-recognition performance of tuned log-Gabor filters with that of standard acoustic features. The experimental results show that features extracted from the spectrogram with pairs of log-Gabor filters match the performance of an approach based on traditional acoustic features. Combining the two feature sets yields emotion-recognition performance that exceeds that of state-of-the-art vocal emotion recognition algorithms. We conclude that tuned log-Gabor filters support the automatic recognition of emotions from speech and may benefit other speech-related tasks.
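
As a rough illustration of the kind of spectro-temporal filtering the abstract describes, the sketch below builds a single 2D log-Gabor filter in the frequency domain and measures how strongly a spectrogram responds to it. It is not the authors' implementation. The function names `log_gabor_2d` and `log_gabor_energy`, the bandwidth parameters `sigma_ratio` and `sigma_theta`, and the use of mean response magnitude as a feature are all assumptions made for illustration; the abstract does not specify how the filters were tuned or paired.

```python
import numpy as np


def log_gabor_2d(shape, f0, theta0, sigma_ratio=0.55, sigma_theta=np.pi / 8):
    """Build a 2D log-Gabor transfer function in the frequency domain.

    f0 is the centre frequency in cycles per sample and theta0 the preferred
    orientation in radians. sigma_ratio and sigma_theta are hypothetical
    bandwidth settings, not values taken from the paper.
    """
    rows, cols = shape
    # Normalised frequency coordinates, with the DC term at index [0, 0].
    fy = np.fft.fftfreq(rows)[:, None]
    fx = np.fft.fftfreq(cols)[None, :]
    radius = np.hypot(fx, fy)
    radius[0, 0] = 1.0  # placeholder to avoid log(0); DC is zeroed below
    angle = np.arctan2(fy, fx)

    # Radial part: a Gaussian on a logarithmic frequency axis centred on f0.
    radial = np.exp(-(np.log(radius / f0) ** 2) / (2 * np.log(sigma_ratio) ** 2))
    radial[0, 0] = 0.0  # a log-Gabor filter has no DC component

    # Angular part: a Gaussian in the wrapped angular distance from theta0.
    d_theta = np.arctan2(np.sin(angle - theta0), np.cos(angle - theta0))
    angular = np.exp(-(d_theta ** 2) / (2 * sigma_theta ** 2))
    return radial * angular


def log_gabor_energy(spectrogram, f0, theta0):
    """Return the mean magnitude of the filter response over the spectrogram."""
    transfer = log_gabor_2d(spectrogram.shape, f0, theta0)
    response = np.fft.ifft2(np.fft.fft2(spectrogram) * transfer)
    return float(np.mean(np.abs(response)))
```

Under these assumptions, calling `log_gabor_energy` over a small grid of `f0` and `theta0` values would give one feature per filter, which could then be passed to any standard classifier alongside, or instead of, conventional acoustic features.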


