SIA-Net: Sparse Interactive Attention Network for Multimodal Emotion Recognition

IF 4.5 · CAS Tier 2, Computer Science · JCR Q1, COMPUTER SCIENCE, CYBERNETICS
Shuzhen Li;Tong Zhang;C. L. Philip Chen
DOI: 10.1109/TCSS.2024.3409715
Journal: IEEE Transactions on Computational Social Systems, vol. 11, no. 5, pp. 6782–6794
Published: 2024-06-28 (Journal Article)
URL: https://ieeexplore.ieee.org/document/10577436/
Citations: 0

Abstract

Multimodal emotion recognition (MER) integrates multiple modalities to identify the user's emotional state, and is a core technology of natural and friendly human–computer interaction systems. Currently, many researchers have explored comprehensive multimodal information for MER, but few consider that comprehensive multimodal features may contain noisy, useless, or redundant information, which interferes with emotional feature representation. To tackle this challenge, this article proposes a sparse interactive attention network (SIA-Net) for MER. In SIA-Net, the sparse interactive attention (SIA) module mainly consists of intramodal sparsity and intermodal sparsity. The intramodal sparsity provides sparse but effective unimodal features for multimodal fusion. The intermodal sparsity adaptively sparsifies intramodal and intermodal interactive relations and encodes them into sparse interactive attention. The sparse interactive attention, with a small number of nonzero weights, then acts on the multimodal features to highlight a few important features and suppress numerous redundant ones. Furthermore, the intramodal sparsity and intermodal sparsity are deep sparse representations that make unimodal features and multimodal interactions sparse without complicated optimization. Extensive experimental results show that SIA-Net achieves superior performance on three widely used datasets.
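The core idea of attention with only a few nonzero weights can be illustrated with a toy sketch. This is an illustrative top-k sparsification of a plain dot-product attention, not the authors' actual deep sparse representation; the function name and the choice of top-k are assumptions made for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sparse_interactive_attention(features, k=2):
    """Toy sketch: score pairwise interactions among stacked unimodal
    feature vectors, keep only the k largest attention weights per row
    (zeroing the rest), renormalize, and reweight the features."""
    scores = features @ features.T            # pairwise interaction scores
    attn = softmax(scores, axis=-1)           # dense attention weights
    # Sparsify: zero all but the k largest weights in each row.
    drop_idx = np.argsort(attn, axis=-1)[:, :-k]
    sparse_attn = attn.copy()
    np.put_along_axis(sparse_attn, drop_idx, 0.0, axis=-1)
    # Renormalize the surviving nonzero weights so each row sums to 1.
    sparse_attn /= sparse_attn.sum(axis=-1, keepdims=True)
    fused = sparse_attn @ features            # attended (fused) features
    return sparse_attn, fused
```

In this sketch the suppression of redundant features is hard top-k selection; the paper instead learns the sparsity end to end ("without complicated optimization"), so the sketch only conveys the effect of few-nonzero-weight attention, not the learning mechanism.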
Source journal
IEEE Transactions on Computational Social Systems
Subject area: Social Sciences (miscellaneous)
CiteScore: 10.00
Self-citation rate: 20.00%
Annual articles: 316
Journal description: IEEE Transactions on Computational Social Systems focuses on such topics as modeling, simulation, analysis and understanding of social systems from the quantitative and/or computational perspective. "Systems" include man-man, man-machine and machine-machine organizations and adversarial situations as well as social media structures and their dynamics. More specifically, the transactions publishes articles on modeling the dynamics of social systems, methodologies for incorporating and representing socio-cultural and behavioral aspects in computational modeling, analysis of social system behavior and structure, and paradigms for social systems modeling and simulation. The journal also features articles on social network dynamics, social intelligence and cognition, social systems design and architectures, socio-cultural modeling and representation, and computational behavior modeling, and their applications.