Title: Sparse Autoencoder with Attention Mechanism for Speech Emotion Recognition
Authors: Ting-Wei Sun, A. Wu
Venue: 2019 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS)
Publication date: 2019-03-18
DOI: 10.1109/AICAS.2019.8771593
Citations: 11
Abstract
There has been much previous work on speech emotion recognition with machine learning methods. However, most of it relies on the availability of labelled speech data. In this paper, we propose a novel algorithm that combines a sparse autoencoder with an attention mechanism. The aim is to benefit from both labeled and unlabeled data through the autoencoder, and to apply the attention mechanism to focus on speech frames that carry strong emotional information while ignoring frames that do not. The proposed algorithm is evaluated on three public databases in a cross-language setting. Experimental results show that it provides significantly more accurate predictions than existing speech emotion recognition algorithms.
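The two ingredients named in the abstract can be sketched in a few lines. This is a minimal illustrative sketch, not the authors' implementation: it assumes ReLU hidden codes with an L1 sparsity penalty for the sparse autoencoder, and a simple learned-vector softmax attention that pools per-frame features into one utterance-level vector. All function names, shapes, and the hyperparameter `sparsity_weight` are assumptions for illustration.

```python
import numpy as np

def sparse_autoencoder_loss(x, W_enc, W_dec, sparsity_weight=1e-3):
    """Reconstruction error plus an L1 penalty that keeps the hidden code sparse.

    x: (frames, features); W_enc: (features, hidden); W_dec: (hidden, features).
    Returns (scalar loss, hidden codes) -- the codes can later serve as
    emotion features for both labeled and unlabeled utterances.
    """
    h = np.maximum(0.0, x @ W_enc)            # ReLU hidden code
    x_hat = h @ W_dec                          # linear reconstruction
    recon = np.mean((x - x_hat) ** 2)          # how well frames are rebuilt
    sparsity = sparsity_weight * np.mean(np.abs(h))
    return recon + sparsity, h

def attention_pool(frame_feats, w_att):
    """Softmax attention over frames: emotionally salient frames get high weight.

    frame_feats: (frames, features); w_att: (features,) learned scoring vector.
    Returns (utterance vector, per-frame attention weights summing to 1).
    """
    scores = frame_feats @ w_att               # one relevance score per frame
    scores -= scores.max()                     # numerical stability for exp
    alpha = np.exp(scores) / np.exp(scores).sum()
    return alpha @ frame_feats, alpha          # weighted average of frames
```

In this reading, the autoencoder is trained on all speech (labeled or not) to learn frame-level codes, and the attention weights `alpha` downweight frames without emotional content before a classifier sees the pooled utterance vector.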