i-Vector Algorithm with Gaussian Mixture Model for Efficient Speech Emotion Recognition

2015 International Conference on Computational Science and Computational Intelligence (CSCI) | Pub Date: 2015-12-07 | DOI: 10.1109/CSCI.2015.17

Joan Gomes, M. El-Sharkawy


Citations: 9

Abstract

Emotions constitute an essential part of our existence because they exert great influence on people's physical and mental health. Emotions often act as a sensitive catalyst that fosters lively interaction between human beings. Over the past few decades, researchers have focused increasingly on the emotional content of speech signals, and many systems have been proposed to make Speech Emotion Recognition (SER) more accurate. The objective of our research is to classify speech emotion using a comparatively new method, the i-vector model. The i-vector model has found much success in speaker identification, speech recognition, and language identification, but it has not been much explored for emotion recognition. This paper discusses the design of a speech emotion recognition system with respect to three important aspects. First, the i-vector model was applied to the extracted features to obtain the speech representation. Second, an appropriate classification scheme was designed using a Gaussian Mixture Model (GMM), Maximum A Posteriori (MAP) adaptation, and the i-vector algorithm. Finally, the performance of the new system was evaluated on an emotional speech database. Speech emotions were identified with both this novel system and a conventional system, and the comparison showed that the proposed system identifies speech emotions with less error and higher accuracy.
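The abstract names GMM modelling and MAP adaptation as building blocks but gives no configuration, so the following is only a minimal sketch of that conventional part, not the authors' implementation. It assumes a diagonal-covariance background GMM, mean-only relevance MAP adaptation, and a relevance factor of 16; the function names and the toy data are hypothetical. The i-vector step itself is not implemented here: in the standard formulation, an utterance's GMM mean supervector is modelled as M = m + Tw, where m is the background supervector, T is a low-rank total-variability matrix, and w is the i-vector.

```python
import numpy as np


def log_gauss_diag(X, means, variances):
    """Log density of each frame under each diagonal Gaussian; returns shape (T, C)."""
    D = X.shape[1]
    diff = X[:, None, :] - means[None, :, :]                  # (T, C, D)
    return -0.5 * (D * np.log(2.0 * np.pi)
                   + np.sum(np.log(variances), axis=1)[None, :]
                   + np.sum(diff ** 2 / variances[None, :, :], axis=2))


def posteriors(X, weights, means, variances):
    """Per-frame component responsibilities; returns shape (T, C)."""
    log_p = np.log(weights)[None, :] + log_gauss_diag(X, means, variances)
    log_p -= log_p.max(axis=1, keepdims=True)                 # numerical stability
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)


def map_adapt_means(X, weights, means, variances, relevance=16.0):
    """Mean-only relevance MAP adaptation of a background GMM to the frames X.

    relevance=16.0 is an assumed value, not one taken from the paper.
    """
    gamma = posteriors(X, weights, means, variances)
    n = gamma.sum(axis=0)                                     # zeroth-order statistics (C,)
    f = gamma.T @ X                                           # first-order statistics (C, D)
    data_mean = f / np.maximum(n, 1e-10)[:, None]
    alpha = (n / (n + relevance))[:, None]                    # data-dependent mixing weight
    return alpha * data_mean + (1.0 - alpha) * means


def avg_log_likelihood(X, weights, means, variances):
    """Average per-frame log-likelihood of X under a diagonal GMM."""
    ll = np.log(weights)[None, :] + log_gauss_diag(X, means, variances)
    m = ll.max(axis=1, keepdims=True)
    return float(np.mean(m[:, 0] + np.log(np.exp(ll - m).sum(axis=1))))


def classify(X, weights, variances, emotion_means):
    """Return the emotion label whose adapted GMM scores X highest."""
    scores = {label: avg_log_likelihood(X, weights, mu, variances)
              for label, mu in emotion_means.items()}
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Toy 2-D "features" standing in for real acoustic features (hypothetical data).
    rng = np.random.default_rng(0)
    ubm_w = np.array([0.5, 0.5])
    ubm_mu = np.array([[0.0, 0.0], [10.0, 10.0]])
    ubm_var = np.ones((2, 2))
    train = {"emotion_a": rng.normal(1.0, 1.0, size=(200, 2)),
             "emotion_b": rng.normal(9.0, 1.0, size=(200, 2))}
    models = {k: map_adapt_means(v, ubm_w, ubm_mu, ubm_var) for k, v in train.items()}
    test_utt = rng.normal(1.0, 1.0, size=(50, 2))
    print(classify(test_utt, ubm_w, ubm_var, models))
```

In an i-vector pipeline, the zeroth- and first-order statistics computed inside `map_adapt_means` are the same quantities that the total-variability model would consume to estimate w for each utterance; the classifier would then operate on those low-dimensional vectors rather than on frame likelihoods.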


