Moomal Farhad, H. Ismail, S. Harous, M. Masud, A. Beg
2021 2nd International Conference on Computation, Automation and Knowledge Management (ICCAKM), published 2021-01-19
DOI: 10.1109/iccakm50778.2021.9357726
Analysis of Emotion Recognition from Cross-lingual Speech: Arabic, English, and Urdu
In systems that involve interaction between machines and humans, recognizing emotion from audio has long been a focus of research. Emotion recognition can play an essential role in many fields, such as medicine, law, psychology, and customer service. In this paper, we present an empirical comparative analysis of several machine learning classifiers for emotion recognition in audio data. Evaluations are performed on a set of predefined emotions, such as happy, sad, and angry, in Arabic, English, and Urdu speech. Pitch and cepstral features are extracted from the audio files, and principal component analysis is applied for dimensionality reduction. Experiments show that random forest outperformed the other classifiers on the Urdu dataset, with an accuracy of 78.75%. On the Arabic dataset, however, the Meta iterative classifier performed better than random forest and the neural network, with an accuracy of 70%. Classification on the English dataset, whose emotion classes do not differ much in pitch and Mel-frequency cepstral coefficient (MFCC) features, yielded the lowest accuracies, at or below 31%.
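The abstract does not say how pitch is extracted, so the following is only a minimal sketch of one common approach, autocorrelation peak picking, and not the authors' method. The function name `estimate_pitch_hz`, the 50 to 500 Hz search range, and the synthetic 200 Hz test tone are all assumptions made for illustration.

```python
import math


def estimate_pitch_hz(samples, sample_rate, fmin=50.0, fmax=500.0):
    """Estimate the fundamental frequency of one frame (hypothetical helper).

    Picks the lag with the largest unnormalized autocorrelation inside
    the lag range implied by [fmin, fmax]. Returns None when no lag has
    a positive score (for example, on a silent frame).
    """
    lag_min = int(sample_rate / fmax)
    lag_max = min(int(sample_rate / fmin), len(samples) - 1)
    best_lag, best_score = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Correlate the frame with a copy of itself shifted by `lag` samples.
        score = sum(
            samples[n] * samples[n + lag] for n in range(len(samples) - lag)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None


# Assumed test signal: a pure 200 Hz tone, 400 samples at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(400)]
pitch = estimate_pitch_hz(tone, sample_rate)
silence = estimate_pitch_hz([0.0] * 400, sample_rate)
print(pitch, silence)
```

In a pipeline like the one the abstract describes, a per-frame value of this kind would be combined with cepstral features before dimensionality reduction and classification. Real speech is noisier than a pure tone, so a practical system would normally add voicing detection and smoothing across frames.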