Arabic English Speech Emotion Recognition System

Mai El Seknedy, S. Fawzi
2023 20th Learning and Technology Conference (L&T), published 2023-01-26
DOI: 10.1109/LT58159.2023.10092295 (https://doi.org/10.1109/LT58159.2023.10092295)
Citations: 0

Abstract

Speech Emotion Recognition (SER) is an approach to identifying individuals' emotions from their speech. It is important for human-machine interface applications and for the emerging Metaverse. This work presents a bilingual Arabic-English speech emotion recognition system based on the EYASE and RAVDESS datasets. A novel feature set was composed from spectral and prosodic parameters to obtain high performance at a low computational cost. Several machine learning classifiers were applied: Random Forest, Support Vector Machine (SVM), Logistic Regression, Multi-Layer Perceptron, and ensemble learning. The performance of the proposed feature set was compared with that of the "Interspeech 2009" challenge feature set, which is considered a benchmark in the field. Promising results were obtained using the proposed feature set. SVM gave the best emotion recognition rate and execution performance. The best accuracies achieved were 85% on RAVDESS and 64% on EYASE. For valence detection, ensemble learning reached 90% on RAVDESS and 87.6% on EYASE.
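The abstract names ensemble learning alongside the individual classifiers but does not say how the ensemble combines them. As a minimal sketch, assuming simple hard (majority) voting, the idea can be illustrated as follows. The function names, the emotion labels, and the toy predictions are all hypothetical and are not taken from the paper or its datasets.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one list by hard voting.

    predictions: a list of equal-length label lists, one per classifier.
    Hard voting is an assumption here; the abstract does not state the
    ensemble scheme that was actually used.
    """
    combined = []
    for labels in zip(*predictions):
        # Pick the label predicted by the most classifiers for this utterance.
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


def accuracy(predicted, truth):
    """Fraction of utterances whose predicted label matches the true label."""
    correct = sum(1 for p, t in zip(predicted, truth) if p == t)
    return correct / len(truth)


# Toy, made-up predictions for four utterances (not results from the paper).
truth = ["angry", "happy", "sad", "neutral"]
svm_pred = ["angry", "happy", "sad", "neutral"]
rf_pred = ["angry", "sad", "sad", "neutral"]
mlp_pred = ["happy", "happy", "sad", "angry"]

ensemble_pred = majority_vote([svm_pred, rf_pred, mlp_pred])
print(ensemble_pred)
print(accuracy(rf_pred, truth), accuracy(mlp_pred, truth),
      accuracy(ensemble_pred, truth))
```

In this toy case the vote corrects the individual errors of two of the classifiers, which is the usual motivation for combining models. A real system would vote over predictions computed from the extracted spectral and prosodic features.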