Multimodal Affective Analysis Using Hierarchical Attention Strategy with Word-Level Alignment

Yue Gu, Kangning Yang, Shiyu Fu, Shuhong Chen, Xinyu Li, I. Marsic
{"title":"基于层次注意策略的多模态情感分析","authors":"Yue Gu, Kangning Yang, Shiyu Fu, Shuhong Chen, Xinyu Li, I. Marsic","doi":"10.18653/v1/P18-1207","DOIUrl":null,"url":null,"abstract":"Multimodal affective computing, learning to recognize and interpret human affect and subjective information from multiple data sources, is still a challenge because: (i) it is hard to extract informative features to represent human affects from heterogeneous inputs; (ii) current fusion strategies only fuse different modalities at abstract levels, ignoring time-dependent interactions between modalities. Addressing such issues, we introduce a hierarchical multimodal architecture with attention and word-level fusion to classify utterance-level sentiment and emotion from text and audio data. Our introduced model outperforms state-of-the-art approaches on published datasets, and we demonstrate that our model is able to visualize and interpret synchronized attention over modalities.","PeriodicalId":74541,"journal":{"name":"Proceedings of the conference. Association for Computational Linguistics. Meeting","volume":"68 1","pages":"2225-2235"},"PeriodicalIF":0.0000,"publicationDate":"2018-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"100","resultStr":"{\"title\":\"Multimodal Affective Analysis Using Hierarchical Attention Strategy with Word-Level Alignment\",\"authors\":\"Yue Gu, Kangning Yang, Shiyu Fu, Shuhong Chen, Xinyu Li, I. Marsic\",\"doi\":\"10.18653/v1/P18-1207\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Multimodal affective computing, learning to recognize and interpret human affect and subjective information from multiple data sources, is still a challenge because: (i) it is hard to extract informative features to represent human affects from heterogeneous inputs; (ii) current fusion strategies only fuse different modalities at abstract levels, ignoring time-dependent interactions between modalities. Addressing such issues, we introduce a hierarchical multimodal architecture with attention and word-level fusion to classify utterance-level sentiment and emotion from text and audio data. Our introduced model outperforms state-of-the-art approaches on published datasets, and we demonstrate that our model is able to visualize and interpret synchronized attention over modalities.\",\"PeriodicalId\":74541,\"journal\":{\"name\":\"Proceedings of the conference. Association for Computational Linguistics. Meeting\",\"volume\":\"68 1\",\"pages\":\"2225-2235\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-05-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"100\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the conference. Association for Computational Linguistics. Meeting\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.18653/v1/P18-1207\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the conference. Association for Computational Linguistics. 
Meeting","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.18653/v1/P18-1207","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 100

Abstract

Multimodal affective computing, learning to recognize and interpret human affect and subjective information from multiple data sources, is still a challenge because: (i) it is hard to extract informative features to represent human affects from heterogeneous inputs; (ii) current fusion strategies only fuse different modalities at abstract levels, ignoring time-dependent interactions between modalities. Addressing such issues, we introduce a hierarchical multimodal architecture with attention and word-level fusion to classify utterance-level sentiment and emotion from text and audio data. Our introduced model outperforms state-of-the-art approaches on published datasets, and we demonstrate that our model is able to visualize and interpret synchronized attention over modalities.
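To make the abstract's idea concrete, below is a minimal sketch of a hierarchical model that fuses word-aligned text and audio features at the word level and then attends over the fused word sequence to classify the utterance. This is an illustration of the general technique, not the authors' exact architecture: the layer choices, dimensions, and names (WordLevelFusion, UtteranceClassifier) are all assumptions.

```python
# Illustrative sketch only: a gated word-level fusion of text and audio,
# followed by a recurrent encoder and word-level attention, as suggested
# by the abstract. Not the paper's exact model.
import torch
import torch.nn as nn

class WordLevelFusion(nn.Module):
    """Fuses word-aligned text and audio features with a learned gate."""
    def __init__(self, text_dim, audio_dim, hidden_dim):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, hidden_dim)
        self.audio_proj = nn.Linear(audio_dim, hidden_dim)
        self.gate = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, text, audio):
        # text: (batch, words, text_dim); audio: (batch, words, audio_dim)
        t = torch.tanh(self.text_proj(text))
        a = torch.tanh(self.audio_proj(audio))
        g = torch.sigmoid(self.gate(torch.cat([t, a], dim=-1)))
        return g * t + (1 - g) * a  # (batch, words, hidden_dim)

class UtteranceClassifier(nn.Module):
    """Encodes fused word vectors with a BiGRU, attends over words,
    and classifies the resulting utterance vector."""
    def __init__(self, text_dim=300, audio_dim=40, hidden_dim=128, num_classes=4):
        super().__init__()
        self.fusion = WordLevelFusion(text_dim, audio_dim, hidden_dim)
        self.gru = nn.GRU(hidden_dim, hidden_dim,
                          batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden_dim, 1)
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, text, audio):
        fused = self.fusion(text, audio)
        states, _ = self.gru(fused)                        # (batch, words, 2*hidden)
        weights = torch.softmax(self.attn(states), dim=1)  # word-level attention
        utterance = (weights * states).sum(dim=1)          # (batch, 2*hidden)
        return self.classifier(utterance), weights

# Usage with random word-aligned features: 2 utterances, 12 words each.
model = UtteranceClassifier()
logits, attn = model(torch.randn(2, 12, 300), torch.randn(2, 12, 40))
print(logits.shape, attn.shape)  # torch.Size([2, 4]) torch.Size([2, 12, 1])
```

The returned attention weights are what enables the kind of visualization the abstract mentions: because fusion happens per word, each weight can be inspected against the word it covers in both modalities.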