Tijana Đurkić, Aleksandra Lojaničić, S. Suzic, B. Popović, M. Secujski, Tijana V. Nosek
2021 29th Telecommunications Forum (TELFOR), published 2021-11-23. DOI: 10.1109/TELFOR52709.2021.9653287
Emotion recognition from speech based on ML algorithms applied on two Serbian datasets
As machines play an increasing role in people's daily lives, human-machine communication needs to become more similar to communication between two people. For this reason, the need for automatic emotion recognition from speech has arisen. The aim of this paper is to compare the performance of different machine learning algorithms in automatic emotion recognition on two corpora of expressive speech in the Serbian language, one containing speech samples delivered by professional actors and the other produced by amateurs. In both cases, acoustic features were extracted using the OpenSmile toolkit. The machine learning algorithms under investigation are k-nearest neighbours, support vector machines and decision trees. The best performance was achieved by support vector machines with dimensionality reduced by principal component analysis. This approach was shown to achieve an accuracy of more than 80% for each of the 5 analyzed emotions (joy, sadness, fear, anger and neutral) on the amateur speech corpus.
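The comparison described in the abstract can be illustrated with a minimal sketch in Python using scikit-learn. This is not the authors' code: the feature matrix is synthetic (generated with `make_classification` as a stand-in for acoustic feature vectors), and the hyperparameters (`n_neighbors=5`, an RBF kernel, 20 principal components, 5-fold cross-validation) are assumptions chosen only for illustration. The abstract does not state how per-emotion accuracy was computed, so per-class recall is used here as one plausible reading.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

EMOTIONS = ["joy", "sadness", "fear", "anger", "neutral"]

# Synthetic stand-in for per-utterance acoustic features (assumption):
# these are NOT the Serbian corpora and NOT real extracted features.
X, y = make_classification(
    n_samples=500,
    n_features=60,
    n_informative=20,
    n_redundant=10,
    n_classes=len(EMOTIONS),
    n_clusters_per_class=1,
    random_state=0,
)

# The three classifier families named in the abstract, plus SVM preceded by PCA.
# All hyperparameters are illustrative assumptions.
models = {
    "kNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0)),
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "PCA + SVM": make_pipeline(
        StandardScaler(), PCA(n_components=20), SVC(kernel="rbf", C=1.0)
    ),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)


def evaluate(model):
    """Return overall accuracy and per-emotion recall from cross-validated predictions."""
    y_pred = cross_val_predict(model, X, y, cv=cv)
    overall = accuracy_score(y, y_pred)
    per_class = recall_score(
        y, y_pred, labels=list(range(len(EMOTIONS))), average=None
    )
    return overall, dict(zip(EMOTIONS, per_class))


results = {name: evaluate(model) for name, model in models.items()}

for name, (overall, per_emotion) in results.items():
    details = ", ".join(f"{emo}={score:.2f}" for emo, score in per_emotion.items())
    print(f"{name:14s} overall={overall:.2f} | {details}")
```

Placing the scaler and PCA inside the pipeline means both are refit on each training fold, so no information from the held-out fold leaks into the dimensionality reduction. The numbers printed on synthetic data say nothing about the results reported in the paper.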