Speech Emotion Recognition Adapted to Multimodal Semantic Repositories
N. Vryzas, Lazaros Vrysis, Rigas Kotsakis, Charalampos A. Dimoulas
2018 13th International Workshop on Semantic and Social Media Adaptation and Personalization (SMAP)
DOI: 10.1109/SMAP.2018.8501881
Published: 2018-09-01
Citations: 15
Abstract
Speech emotion is an important paralinguistic element of speech communication, which undoubtedly involves a high level of subjectivity and lacks concrete modeling of the implicated emotional states. Specifically, sentimental expression varies greatly across spoken languages and individual speakers. The current work investigates the potential for discriminating emotional states through an adaptive/personalized approach, aiming at the creation of an effective multimodal speech emotion recognition service. In this context, an emotional speech ground-truth database is formulated, containing semantically/emotionally "loaded" utterances of a single speaker expressing five basic sentiments. In the conducted experiments, several classification algorithms are implemented and compared against results obtained on a generalized/augmented multi-speaker emotional speech database. Furthermore, an audio-based application is designed for real-time sentiment identification, utilizing speech recording tools combined with a camera and a Speech-to-Text module. The audio, video, and text files for every spoken utterance are labeled and stored via a user-friendly and functional GUI, for the subsequent augmentation of the personalized database.
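The paper does not publish its feature set, classifier configuration, or emotion labels, so the following is only a minimal sketch of the general approach the abstract describes: classifying fixed-length acoustic feature vectors (e.g., MFCC statistics per utterance) into five basic sentiments. The label names, feature dimensionality, and the SVM choice are all assumptions, and the features here are synthetic stand-ins rather than real audio descriptors.

```python
# Hedged sketch of five-class speech emotion classification.
# All names below (EMOTIONS, feature sizes, SVC settings) are illustrative
# assumptions, not the authors' actual pipeline.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Assumed label set; the paper only says "five basic sentiments".
EMOTIONS = ["anger", "happiness", "neutral", "sadness", "fear"]

# Synthetic stand-in for per-utterance acoustic feature vectors
# (e.g., 26 MFCC-derived statistics per utterance).
n_per_class, n_features = 40, 26
X = np.vstack([
    rng.normal(loc=i, scale=1.0, size=(n_per_class, n_features))
    for i in range(len(EMOTIONS))
])
y = np.repeat(EMOTIONS, n_per_class)

# Hold out a test split, train an RBF-kernel SVM, and score it.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
clf = SVC(kernel="rbf", C=1.0).fit(X_tr, y_tr)
accuracy = clf.score(X_te, y_te)
```

In a personalized setting like the one the abstract describes, the same pipeline would be trained once on the single-speaker ground-truth database and separately on the multi-speaker database, so the two accuracy figures can be compared.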