OCAE: Organization-Controlled Autoencoder for Unsupervised Speech Emotion Analysis
Siwei Wang, Catherine Soladié, R. Séguier
2019 5th International Conference on Frontiers of Signal Processing (ICFSP), 2019-09-01
DOI: https://doi.org/10.1109/icfsp48124.2019.8938073
Citation count: 2
Abstract
A major obstacle to speech emotion analysis is the scarcity of adequately labelled speech data. It is therefore important to consider unsupervised methods that generate a low-dimensional representation for analyzing emotions. Such a data-driven representation needs to be stable and meaningful, like the 2D or 3D representations of emotion developed in psychology. In this paper, we propose a fully unsupervised approach, called Organization-Controlled AutoEncoder (OCAE), that combines an autoencoder with PCA to build an emotional representation. We use the result of PCA on speech features to control the organization of the data in the latent space of the autoencoder by adding an organization loss to the classical objective function. PCA preserves the organization of the data, whereas the autoencoder leads to better discrimination of the data; combining the two takes advantage of both. Results on the Emo-DB and SEMAINE databases show that our representation, generated in an unsupervised manner, is meaningful and stable.
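The idea of adding an organization loss to the reconstruction objective can be illustrated with a small sketch. The abstract does not give the exact form of the organization loss, the network architecture, or the features, so everything here is an assumption made for illustration: the organization term is taken to be the mean squared difference between the latent codes and the PCA projection, the encoder and decoder are single linear maps, the weight `lam` and learning rate `lr` are hypothetical, and random data stands in for speech features.

```python
import numpy as np

def pca_project(X, k):
    """Project mean-centered data onto its first k principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def ocae_loss(X, Z, X_hat, P, lam):
    """Reconstruction loss plus an ASSUMED organization penalty.

    The penalty (mean squared distance between latent codes Z and the
    PCA projection P) is an illustrative choice, not the paper's formula.
    """
    recon = np.mean((X - X_hat) ** 2)
    org = np.mean((Z - P) ** 2)
    return recon + lam * org, recon, org

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))   # random stand-in for speech features
X = X - X.mean(axis=0)
n, d = X.shape
k = 2                            # latent dimension (2D representation)
P = pca_project(X, k)            # target organization from PCA

# Linear encoder and decoder (a simplification of a real autoencoder).
W_enc = rng.normal(scale=0.1, size=(d, k))
W_dec = rng.normal(scale=0.1, size=(k, d))
lam, lr = 1.0, 0.05              # hypothetical weight and step size

losses = []
for _ in range(300):
    Z = X @ W_enc                # latent codes
    X_hat = Z @ W_dec            # reconstruction
    total, recon, org = ocae_loss(X, Z, X_hat, P, lam)
    losses.append(total)

    # Manual gradients of the combined objective.
    d_xhat = 2.0 * (X_hat - X) / (n * d)
    d_z = d_xhat @ W_dec.T + lam * 2.0 * (Z - P) / (n * k)
    grad_dec = Z.T @ d_xhat
    grad_enc = X.T @ d_z
    W_enc -= lr * grad_enc
    W_dec -= lr * grad_dec

print(f"initial loss {losses[0]:.4f}, final loss {losses[-1]:.4f}")
```

In this sketch, setting `lam` to zero recovers a plain autoencoder objective, while a large `lam` pulls the latent space toward the PCA layout; the balance between the two is the trade-off the abstract describes between preserving organization and improving discrimination.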