Multimodal Emotion Recognition by extracting common and modality-specific information

Wei Zhang, Weixi Gu, Fei Ma, S. Ni, Lin Zhang, Shao-Lun Huang
DOI: 10.1145/3274783.3275200
Published in: Proceedings of the 16th ACM Conference on Embedded Networked Sensor Systems, 2018-11-04
Citations: 12

Abstract

Emotion recognition technologies have been widely used in numerous areas, including advertising, healthcare, and online education. Previous works usually recognize emotion from either the acoustic or the visual signal alone, yielding unsatisfactory performance and limited applicability. To improve the inference capability, we present a multimodal emotion recognition model, EMOdal. Apart from learning the audio and visual data separately, EMOdal efficiently learns the common and modality-specific information underlying the two kinds of signals, and therefore improves the inference ability. The model has been evaluated on our large-scale emotional data set. The comprehensive evaluations demonstrate that our model outperforms traditional approaches.
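The abstract's core idea, separating what the two signals share from what each contributes alone, can be sketched as a fusion pipeline. The following is a minimal illustrative sketch, not the paper's actual EMOdal architecture: all dimensions, weight names, and the use of random (untrained) projections are assumptions made for illustration. Each modality is projected into a shared "common" space and into its own "specific" space, and the fused representation feeds a classifier.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper).
D_AUDIO, D_VISUAL = 40, 64          # per-modality input feature sizes
D_COMMON, D_SPECIFIC = 16, 16       # shared vs. modality-specific embedding sizes
N_EMOTIONS = 6                      # number of emotion classes

# Random projections stand in for learned encoders.
W_common_a = rng.normal(size=(D_AUDIO, D_COMMON))    # audio  -> shared space
W_common_v = rng.normal(size=(D_VISUAL, D_COMMON))   # visual -> shared space
W_spec_a   = rng.normal(size=(D_AUDIO, D_SPECIFIC))  # audio-only features
W_spec_v   = rng.normal(size=(D_VISUAL, D_SPECIFIC)) # visual-only features
W_cls      = rng.normal(size=(D_COMMON + 2 * D_SPECIFIC, N_EMOTIONS))

def fuse_and_classify(audio, visual):
    """Project both modalities, pool the shared parts, and classify."""
    # Modality-common information: both signals land in one shared space.
    common = 0.5 * (audio @ W_common_a + visual @ W_common_v)
    # Modality-specific information is kept separate, then concatenated.
    fused = np.concatenate([common, audio @ W_spec_a, visual @ W_spec_v])
    logits = fused @ W_cls
    return int(np.argmax(logits))

pred = fuse_and_classify(rng.normal(size=D_AUDIO), rng.normal(size=D_VISUAL))
print(pred)  # an emotion class index in [0, N_EMOTIONS)
```

In a trained system the projections would be learned jointly, typically with losses that push the two common embeddings together while keeping the specific branches informative; this sketch only shows the data flow.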