Influences of age in emotion recognition of spontaneous speech: A case of an under-resourced language
N. Jamil, F. Apandi, Raseeda Hamzah
2017 International Conference on Speech Technology and Human-Computer Dialogue (SpeD), published 2017-07-06
DOI: 10.1109/SPED.2017.7990448 (https://doi.org/10.1109/SPED.2017.7990448)
Citations: 6
Abstract
Recognizing emotions in natural, spontaneous speech is far more difficult than recognizing them in acted or elicited speech. Speech emotion recognition for real conversation, such as spontaneous speech, requires linguistic information about the speech to be included in the recognition component to achieve a high recognition rate. However, for an under-resourced language that lacks digital speech resources, this requirement poses a problem. This paper presents speech emotion recognition of spontaneous Malay speech using prosodic features and a Random Forest classifier. We also investigate the influence of age, categorized as children, young adults, and middle-aged, on emotion recognition. Ninety spontaneous speech sentences from 30 native speakers of Malay were collected and classified into three emotions: happy, angry, and sad. Results show that the spontaneous speech of the middle-aged group achieved the highest accuracy, followed by the children group and finally the young adults. While sad emotions are recognized satisfactorily across all age groups, happy and angry emotions are confused with each other.
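The abstract reports two kinds of results: accuracy per age group and confusions between emotion classes. The sketch below shows one way such figures could be tallied from a classifier's predictions. It is a minimal illustration only: the function names, the record layout, and the toy records are assumptions made here, and the data is made up, not taken from the paper.

```python
from collections import Counter

EMOTIONS = ("happy", "angry", "sad")


def accuracy_by_group(records):
    """Fraction of correct predictions per age group.

    records: iterable of (age_group, true_emotion, predicted_emotion).
    """
    correct = Counter()
    total = Counter()
    for group, true, pred in records:
        total[group] += 1
        if true == pred:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


def confusion_matrix(records):
    """Nested dict where matrix[true][pred] counts the predictions."""
    matrix = {t: {p: 0 for p in EMOTIONS} for t in EMOTIONS}
    for _group, true, pred in records:
        matrix[true][pred] += 1
    return matrix


# Hypothetical toy predictions for illustration; NOT the paper's data.
toy_records = [
    ("children", "happy", "angry"),
    ("children", "sad", "sad"),
    ("young adults", "angry", "happy"),
    ("young adults", "happy", "angry"),
    ("middle-aged", "sad", "sad"),
    ("middle-aged", "angry", "angry"),
]

acc = accuracy_by_group(toy_records)
cm = confusion_matrix(toy_records)
```

In the matrix, the off-diagonal cells `cm["happy"]["angry"]` and `cm["angry"]["happy"]` are where a happy/angry confusion of the kind the abstract describes would show up, while `cm["sad"]["sad"]` counts correctly recognized sad utterances.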