Multimodal Sentiment Analysis Based on Cross-Module Fusion

Jingwei Zhang (张敬伟), Yansong Li (李彦松)
DOI: 10.55375/cps.2023.3.2 (https://doi.org/10.55375/cps.2023.3.2)
Journal: 计算机科学 (Computer Science)
Published: 2023-02-20
Type: Journal Article
Citations: 0

Abstract

Understanding expressed sentiment and emotion are the two key factors in multimodal sentiment analysis. Human language is usually multimodal, comprising visual, speech, and text modalities, and each modality carries many kinds of information. The text modality includes basic linguistic symbols, syntax, and language actions; the speech modality includes speech, intonation, and vocal expression; the visual modality includes posture features, body language, eye gaze, and facial expressions. How to fuse information across modalities efficiently has therefore become a hot topic in multimodal sentiment analysis. To this end, the article proposes a network model based on cross-module fusion. The model uses LSTM networks as the representation sub-networks for the language and visual modalities, and uses the cross-module fusion of an improved Transformer model to fuse the information of the two modalities effectively. To verify the effectiveness of the proposed model, it was carefully evaluated on the IEMOCAP and MOSEI datasets; the results show that the model improves the accuracy of emotion classification.
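The abstract does not spell out the fusion step, so the following is only a minimal sketch of the general idea behind Transformer-style cross-modal attention, not the authors' model. It assumes plain scaled dot-product attention in which each language time step queries all visual time steps; the function names, the toy feature vectors (standing in for LSTM hidden states), and the absence of learned projections are all assumptions made for illustration.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[Vector]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_attention(queries: Matrix, keys: Matrix, values: Matrix) -> Matrix:
    """Scaled dot-product attention: each query step from one modality
    attends over every step of the other modality (keys/values)."""
    dim = len(keys[0])
    fused: Matrix = []
    for q in queries:
        # Similarity between this query step and each key step.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        fused.append(
            [
                sum(w * v[j] for w, v in zip(weights, values))
                for j in range(len(values[0]))
            ]
        )
    return fused


# Hypothetical per-time-step features standing in for LSTM hidden states.
language_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 word steps, dim 2
visual_states = [[0.5, 0.5], [1.0, -1.0]]               # 2 frame steps, dim 2

# Language steps query the visual sequence; output has one row per word step.
fused = cross_modal_attention(language_states, visual_states, visual_states)
```

Each row of `fused` is a weighted average of the visual vectors, so the language sequence is re-expressed in terms of the visual information it aligns with most strongly. A full model would add learned query/key/value projections, multiple heads, and a classifier on top.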
Source journal
Self-citation rate: 0.00%
Articles published: 21143
About the journal: Computer Science (CS) was established in 1974; until 1979 its title was Computer Applications and Applied Mathematics. It is sponsored by Chongqing Southwest Information Co., Ltd., is a member journal of the CCF (China Computer Federation), and is a CCF Class B journal. It mainly reports on developments, methodologies, and techniques across a wide range of computer science and technology, along with advanced international research results. It is included in many important national and international index databases, such as CSCD, GCJC, CSA, DOAJ, IC, UPD, and JST. Its readers include college students, researchers, and technical staff working in computer science and technology.