Yiying Zhang, Nan Zhang, Yiyang Liu, Caixia Ma, Delong Wang
{"title":"基于语音-文本的多模态情感识别方法","authors":"Yiying Zhang, Nan Zhang, Yiyang Liu, Caixia Ma, Delong Wang","doi":"10.1109/ICTech55460.2022.00088","DOIUrl":null,"url":null,"abstract":"Aiming at the problems of low recognition rate and easy to be disturbed by noise in the process of single-mode speech emotion recognition, this paper proposes a speech emotion analysis method based on multi feature fusion of speech and semantics. This method uses opensmile to extract acoustic features and Bi long and short term memory network (Bi-LSTM) to extract semantic features, then carries out feature data fusion, and then inputs the fused data into SVM classification model to obtain the final emotion classification result. This method can effectively solve the shortcomings of single-mode emotion recognition and improve the efficiency and accuracy of recognition.","PeriodicalId":290836,"journal":{"name":"2022 11th International Conference of Information and Communication Technology (ICTech))","volume":"4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A Multimodal Emotion Recognition Method Based on Speech-Text\",\"authors\":\"Yiying Zhang, Nan Zhang, Yiyang Liu, Caixia Ma, Delong Wang\",\"doi\":\"10.1109/ICTech55460.2022.00088\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Aiming at the problems of low recognition rate and easy to be disturbed by noise in the process of single-mode speech emotion recognition, this paper proposes a speech emotion analysis method based on multi feature fusion of speech and semantics. This method uses opensmile to extract acoustic features and Bi long and short term memory network (Bi-LSTM) to extract semantic features, then carries out feature data fusion, and then inputs the fused data into SVM classification model to obtain the final emotion classification result. 
This method can effectively solve the shortcomings of single-mode emotion recognition and improve the efficiency and accuracy of recognition.\",\"PeriodicalId\":290836,\"journal\":{\"name\":\"2022 11th International Conference of Information and Communication Technology (ICTech))\",\"volume\":\"4 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-02-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 11th International Conference of Information and Communication Technology (ICTech))\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICTech55460.2022.00088\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 11th International Conference of Information and Communication Technology (ICTech))","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICTech55460.2022.00088","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A Multimodal Emotion Recognition Method Based on Speech-Text
To address the low recognition rate and susceptibility to noise of single-modality speech emotion recognition, this paper proposes a speech emotion analysis method based on the fusion of acoustic and semantic features. The method uses openSMILE to extract acoustic features and a bidirectional long short-term memory network (Bi-LSTM) to extract semantic features, fuses the two feature sets at the feature level, and feeds the fused representation into an SVM classifier to obtain the final emotion label. This approach mitigates the shortcomings of single-modality emotion recognition and improves both the efficiency and the accuracy of recognition.
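The fusion-then-classify pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the openSMILE acoustic vectors and Bi-LSTM sentence embeddings are replaced by random stand-in arrays, and the feature dimensions, number of classes, and SVM kernel are assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Stand-ins for the two modalities (assumptions, not the paper's data):
# real acoustic features would come from openSMILE, and real semantic
# features from a Bi-LSTM text encoder.
rng = np.random.default_rng(0)
n_samples = 200
acoustic = rng.normal(size=(n_samples, 88))   # per-utterance acoustic vector
semantic = rng.normal(size=(n_samples, 128))  # per-utterance text embedding
labels = rng.integers(0, 4, size=n_samples)   # hypothetical 4 emotion classes

# Feature-level fusion: concatenate the two modality vectors per sample.
fused = np.concatenate([acoustic, semantic], axis=1)

X_train, X_test, y_train, y_test = train_test_split(
    fused, labels, test_size=0.25, random_state=0)

# SVM classifier on the fused features, as in the described method.
clf = SVC(kernel="rbf")
clf.fit(X_train, y_train)
pred = clf.predict(X_test)
```

The key point the sketch shows is early (feature-level) fusion: the classifier sees one concatenated vector per utterance, so it can weigh acoustic and semantic evidence jointly rather than voting across separate single-modality models.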