A Performance Evaluation of Machine Learning Algorithms for Emotion Recognition through Speech
Biswajeet Sahu, H. Palo, S. Mohanty
2021 8th International Conference on Computing for Sustainable Global Development (INDIACom), 2021-03-17
DOI: 10.1109/INDIACom51348.2021.00004
This paper aims to recognize human expressive states from voice samples. It extracts a small set of reliable features and combines them for effective recognition. First, a few sub-band spectral properties are extracted from voice samples containing emotional information. Next, the pitch and its standard deviation, along with log-energy features, are extracted to build an efficient combinational model. The chosen features are complementary and are therefore expected to increase the available emotional information. To validate the combinational framework, several Machine Learning Algorithms (MLAs) are simulated and compared. Among the classifiers, the Random Forest (RF) outperforms all others in classification accuracy, whereas the Decision Tree remains the least computationally expensive.
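The pitch, pitch standard deviation, and log-energy features named in the abstract can be illustrated with a minimal sketch. The abstract does not state how the authors computed them, so everything here is an assumption made for illustration: the autocorrelation pitch estimator, the 50 to 400 Hz search range, the framing scheme, and the hypothetical helper names `frames`, `log_energy`, `pitch_autocorr`, and `utterance_features`. The sub-band spectral features are not covered; in a combined model they would presumably be appended to the same vector.

```python
import math


def frames(signal, frame_len, hop):
    """Split a list of samples into overlapping frames of frame_len samples."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def log_energy(frame, eps=1e-12):
    """Natural log of the sum of squared samples (eps avoids log(0))."""
    return math.log(sum(x * x for x in frame) + eps)


def pitch_autocorr(frame, sample_rate, f_min=50.0, f_max=400.0):
    """Estimate pitch in Hz as the lag that maximizes the autocorrelation.

    Assumption: a plain (unnormalized) autocorrelation search, chosen for
    illustration only; it is not claimed to be the paper's estimator.
    """
    lag_min = max(1, int(sample_rate / f_max))
    lag_max = min(int(sample_rate / f_min), len(frame) - 1)
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        val = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return sample_rate / best_lag


def utterance_features(signal, sample_rate, frame_len, hop):
    """Return [mean pitch, pitch standard deviation, mean log-energy]."""
    fs = frames(signal, frame_len, hop)
    if not fs:
        raise ValueError("signal is shorter than one frame")
    pitches = [pitch_autocorr(f, sample_rate) for f in fs]
    energies = [log_energy(f) for f in fs]
    mean_pitch = sum(pitches) / len(pitches)
    # Population standard deviation of the per-frame pitch estimates.
    std_pitch = math.sqrt(
        sum((p - mean_pitch) ** 2 for p in pitches) / len(pitches))
    mean_energy = sum(energies) / len(energies)
    return [mean_pitch, std_pitch, mean_energy]
```

Calling `utterance_features(samples, 8000, 320, 160)` on a list of samples recorded at 8 kHz yields one three-element vector per utterance, which could then be passed to any of the compared classifiers.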