Conversational Speech Emotion Recognition From Indonesian Spoken Language Using Recurrent Neural Network-Based Model
Authors: Aisyah Nurul Izzah Adma, D. Lestari
Venue: 2021 8th International Conference on Advanced Informatics: Concepts, Theory and Applications (ICAICTA)
Published: 2021-09-29
DOI: https://doi.org/10.1109/ICAICTA53211.2021.9640273
Citations: 0
Abstract
Natural human-computer interaction requires systems that account for the emotional aspects of communication. Existing speech emotion recognition studies for the Indonesian language treat utterances as independent entities, ignoring the relations among utterances within a conversation. This paper presents a study of conversational speech emotion recognition in Indonesian. We build an RNN-based model that lets each utterance capture contextual information from the surrounding utterances in the same conversation, which aids the emotion classifier. We also construct a conversational emotion corpus in Indonesian from a podcast about life experiences, so that the utterances carry natural emotion. Our experiments use Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks to model emotion from acoustic and lexical features. In our evaluation, fusing acoustic and lexical contextual features with the LSTM model achieves an F-measure of 58.17% for six emotion classes and 72.52% for four emotion classes.
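
The idea of letting each utterance draw context from the rest of its conversation can be illustrated with a small sketch. This is not the authors' implementation. It assumes a single unidirectional GRU, fusion by concatenating the acoustic and lexical vectors before the recurrent layer, random untrained weights, omitted bias terms, and hypothetical feature sizes and emotion labels. The abstract does not state the fusion stage, the direction of the context, or the class names, so all of these are illustrative choices only.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Minimal GRU cell without bias terms (an assumption for brevity)."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # One input matrix W and one recurrent matrix U per gate:
        # z = update gate, r = reset gate, n = candidate state.
        self.W = {g: rand_matrix(hidden_size, input_size, rng) for g in "zrn"}
        self.U = {g: rand_matrix(hidden_size, hidden_size, rng) for g in "zrn"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in
             zip(matvec(self.W["n"], x), matvec(self.U["n"], reset_h))]
        # Blend the candidate state with the previous state.
        return [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


def contextualize(acoustic, lexical, cell):
    """Run the GRU over one conversation, one step per utterance."""
    h = [0.0] * cell.hidden_size
    states = []
    for a, l in zip(acoustic, lexical):
        fused = a + l  # list concatenation: feature-level fusion (assumed)
        h = cell.step(fused, h)
        states.append(h)
    return states


def classify(states, w_out, labels):
    """Pick the highest-scoring label for each contextual state."""
    predictions = []
    for h in states:
        logits = matvec(w_out, h)
        predictions.append(labels[logits.index(max(logits))])
    return predictions


rng = random.Random(0)
labels = ["angry", "happy", "neutral", "sad"]  # hypothetical four-class set
# One toy conversation of 5 utterances with made-up feature vectors.
acoustic = [[rng.random() for _ in range(3)] for _ in range(5)]
lexical = [[rng.random() for _ in range(2)] for _ in range(5)]

cell = GRUCell(input_size=5, hidden_size=4, rng=rng)
w_out = rand_matrix(len(labels), 4, rng)

states = contextualize(acoustic, lexical, cell)
predictions = classify(states, w_out, labels)
print(predictions)
```

Because the hidden state is carried from one utterance to the next, the representation of a later utterance depends on the earlier ones, which is the property that distinguishes this setup from classifying each utterance independently. An LSTM or a bidirectional variant would follow the same pattern with a different cell.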