Multimodal Emotion Recognition by Extracting Common and Modality-Specific Information
Wei Zhang, Weixi Gu, Fei Ma, S. Ni, Lin Zhang, Shao-Lun Huang
Proceedings of the 16th ACM Conference on Embedded Networked Sensor Systems (SenSys '18), November 2018. DOI: 10.1145/3274783.3275200
Emotion recognition technologies are widely used in areas including advertising, healthcare, and online education. Previous works usually recognize emotion from either the acoustic or the visual signal alone, yielding unsatisfactory performance and limited applicability. To improve inference capability, we present a multimodal emotion recognition model, EMOdal. In addition to learning from the audio and visual data separately, EMOdal efficiently learns the common and modality-specific information underlying the two kinds of signals, thereby improving its inference ability. The model has been evaluated on our large-scale emotion data set, and comprehensive evaluations demonstrate that it outperforms traditional approaches.
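The abstract describes an architecture that combines per-modality representations with a shared representation learned from both signals. The sketch below illustrates one way such a "common plus modality-specific" design can be wired up; the PyTorch framing, layer sizes, class names (EmoModalSketch, audio_encoder, visual_encoder), and the L2 alignment penalty are illustrative assumptions, not the authors' EMOdal implementation.

```python
# Hedged sketch: separate audio/visual encoders whose outputs are projected
# into a shared ("common") space and into modality-specific spaces, then
# concatenated for classification. Sizes and names are hypothetical.
import torch
import torch.nn as nn


class EmoModalSketch(nn.Module):
    def __init__(self, audio_dim=128, visual_dim=512, hidden_dim=256,
                 common_dim=64, specific_dim=64, num_emotions=6):
        super().__init__()
        # Per-modality feature extractors (simple MLP stand-ins).
        self.audio_encoder = nn.Sequential(nn.Linear(audio_dim, hidden_dim), nn.ReLU())
        self.visual_encoder = nn.Sequential(nn.Linear(visual_dim, hidden_dim), nn.ReLU())
        # Projections into a shared (common) space and modality-specific spaces.
        self.audio_to_common = nn.Linear(hidden_dim, common_dim)
        self.visual_to_common = nn.Linear(hidden_dim, common_dim)
        self.audio_specific = nn.Linear(hidden_dim, specific_dim)
        self.visual_specific = nn.Linear(hidden_dim, specific_dim)
        # Classifier over the concatenated common + specific representations.
        self.classifier = nn.Linear(2 * common_dim + 2 * specific_dim, num_emotions)

    def forward(self, audio_feat, visual_feat):
        h_a = self.audio_encoder(audio_feat)
        h_v = self.visual_encoder(visual_feat)
        c_a = self.audio_to_common(h_a)
        c_v = self.visual_to_common(h_v)
        s_a = self.audio_specific(h_a)
        s_v = self.visual_specific(h_v)
        # An alignment penalty between the two common projections encourages
        # them to capture information shared across modalities (assumed here
        # as a simple L2 term; the paper's actual objective may differ).
        align_loss = ((c_a - c_v) ** 2).mean()
        logits = self.classifier(torch.cat([c_a, c_v, s_a, s_v], dim=-1))
        return logits, align_loss


if __name__ == "__main__":
    model = EmoModalSketch()
    audio = torch.randn(8, 128)   # batch of pooled acoustic features
    visual = torch.randn(8, 512)  # batch of pooled visual features
    logits, align_loss = model(audio, visual)
    print(logits.shape, align_loss.item())  # torch.Size([8, 6]) and a scalar
```

In a setup like this, the classification loss on the logits would typically be combined with the alignment term, so the common projections are pushed toward agreement while the modality-specific branches remain free to capture signal unique to audio or video.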