AAEC: An Adversarial Autoencoder-based Classifier for Audio Emotion Recognition

Proceedings of the 1st International on Multimodal Sentiment Analysis in Real-life Media Challenge and Workshop. Pub Date: 2020-10-15. DOI: 10.1145/3423327.3423669

Changzeng Fu, Jiaqi Shi, Chaoran Liu, C. Ishi, H. Ishiguro


Citations: 5

Abstract


AAEC: An Adversarial Autoencoder-based Classifier for Audio Emotion Recognition

In recent years, automatic emotion recognition has attracted researchers' attention because of its strong impact and wide application in supporting human activities. Because emotion data are difficult to collect and organize into large databases comparable to text or image datasets, the training set is unlikely to cover the true distribution completely, which harms the model's robustness and generalization in subsequent applications. In this paper, we propose the Adversarial Autoencoder-based Classifier (AAEC), a model that not only augments data within the real data distribution but also reasonably extends the boundary of the current data distribution into a possible space. Such an extended space better fits the distribution of both the training and testing sets. In addition to comparing AAEC with baseline models, we modify the proposed model into different configurations and conduct a comprehensive self-comparison on the audio modality. The experimental results show that the proposed model outperforms the baselines.

