Emotional Speech Recognition using Deep Learning
Othman Omran Khalifa, M. Alhamada, A. Abdalla
Majlesi Journal of Electrical Engineering, vol. 14, pp. 39-55, published 2020-12-01
DOI: 10.29252/MJEE.14.4.39 (https://doi.org/10.29252/MJEE.14.4.39)
Speech emotion recognition (SER) studies the formation and change of a speaker's emotional state from the speech signal. The main purpose of this field is to produce convenient systems that can communicate and interact with humans effortlessly. Current speech emotion recognition systems are still far from reliable. The task is challenging because of the gap between acoustic features and human emotions, and because systems rely strongly on the discriminative acoustic features extracted for a given recognition task. Deep learning techniques have recently been proposed as an alternative to traditional techniques in SER. This paper presents an overview of deep learning techniques that could be used in emotional speech recognition. Extracted features such as MFCC are discussed, along with classification methods such as HMM, GMM, LSTM, and ANN. The review also covers the databases used, the emotions extracted, and the contributions made toward speech emotion recognition.
About the journal:
The scope of the Majlesi Journal of Electrical Engineering (MJEE) ranges from mathematical foundations to practical engineering design in all areas of electrical engineering. The editorial board is international, and original unpublished papers are welcome from throughout the world. The journal is devoted primarily to research papers, but very high quality survey and tutorial papers are also published. There is no publication charge for authors.