ETS System for AV+EC 2015 Challenge

Proceedings of the 5th International Workshop on Audio/Visual Emotion Challenge. Pub Date: 2015-10-26. DOI: 10.1145/2808196.2811639

P. Cardinal, N. Dehak, Alessandro Lameiras Koerich, J. Alam, Patrice Boucher


Citations: 20

Abstract


ETS System for AV+EC 2015 Challenge

This paper presents the system that we developed for the AV+EC 2015 challenge, which is mainly based on deep neural networks (DNNs). We investigated different options using the audio feature set as a base system, and the improvements achieved on this modality were then applied to the other modalities. One of our main findings is that frame stacking improves the quality of the predictions made by our model; the improvement was also observed in all other modalities. We also present a new feature set derived from the cardiac rhythm extracted from electrocardiogram readings. This feature set helped us improve the concordance correlation coefficient for valence from 0.088 to 0.124 on the development set, an improvement of 25%. Finally, we studied the fusion of all modalities, both at the feature level using a DNN and at the prediction level by training linear and random forest regressors. Both fusion schemes provided promising results.
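Two ideas named in the abstract, frame stacking and the concordance correlation coefficient, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `stack_frames` and `concordance_cc`, the symmetric `context` window, and the edge padding are assumptions made for illustration, and the coefficient uses the standard definition with population variance and covariance.

```python
import numpy as np


def stack_frames(features: np.ndarray, context: int) -> np.ndarray:
    """Concatenate each frame with `context` neighbours on each side.

    `features` has shape (num_frames, dim); the result has shape
    (num_frames, dim * (2 * context + 1)). The first and last frames
    are repeated to pad the edges (an assumption of this sketch).
    """
    num_frames = features.shape[0]
    padded = np.pad(features, ((context, context), (0, 0)), mode="edge")
    # Slice `offset` holds, for every frame i, the frame at i - context + offset.
    windows = [padded[offset:offset + num_frames]
               for offset in range(2 * context + 1)]
    return np.hstack(windows)


def concordance_cc(truth, pred) -> float:
    """Concordance correlation coefficient between two 1-D sequences.

    CCC = 2 * cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y)) ** 2),
    using population (ddof=0) statistics throughout.
    """
    truth = np.asarray(truth, dtype=float)
    pred = np.asarray(pred, dtype=float)
    mean_t, mean_p = truth.mean(), pred.mean()
    covariance = np.mean((truth - mean_t) * (pred - mean_p))
    return 2.0 * covariance / (truth.var() + pred.var() + (mean_t - mean_p) ** 2)
```

With `context=1`, a sequence of 2-dimensional frames becomes a sequence of 6-dimensional inputs, so the regressor sees a short temporal window rather than a single frame. Unlike Pearson correlation, the coefficient penalizes a constant offset between predictions and labels, which is why it drops below 1 when predictions are merely shifted.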

