Conditional selection with CNN augmented transformer for multimodal affective analysis

IF 8.4 2区 计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Jianwen Wang, Shiping Wang, Shunxin Xiao, Renjie Lin, Mianxiong Dong, Wenzhong Guo
{"title":"Conditional selection with CNN augmented transformer for multimodal affective analysis","authors":"Jianwen Wang,&nbsp;Shiping Wang,&nbsp;Shunxin Xiao,&nbsp;Renjie Lin,&nbsp;Mianxiong Dong,&nbsp;Wenzhong Guo","doi":"10.1049/cit2.12320","DOIUrl":null,"url":null,"abstract":"<p>Attention mechanism has been a successful method for multimodal affective analysis in recent years. Despite the advances, several significant challenges remain in fusing language and its nonverbal context information. One is to generate sparse attention coefficients associated with acoustic and visual modalities, which helps locate critical emotional semantics. The other is fusing complementary cross-modal representation to construct optimal salient feature combinations of multiple modalities. A Conditional Transformer Fusion Network is proposed to handle these problems. Firstly, the authors equip the transformer module with CNN layers to enhance the detection of subtle signal patterns in nonverbal sequences. Secondly, sentiment words are utilised as context conditions to guide the computation of cross-modal attention. As a result, the located nonverbal features are not only salient but also complementary to sentiment words directly. Experimental results show that the authors’ method achieves state-of-the-art performance on several multimodal affective analysis datasets.</p>","PeriodicalId":46211,"journal":{"name":"CAAI Transactions on Intelligence Technology","volume":"9 4","pages":"917-931"},"PeriodicalIF":8.4000,"publicationDate":"2024-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12320","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"CAAI Transactions on Intelligence Technology","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1049/cit2.12320","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

Abstract

Attention mechanism has been a successful method for multimodal affective analysis in recent years. Despite the advances, several significant challenges remain in fusing language and its nonverbal context information. One is to generate sparse attention coefficients associated with acoustic and visual modalities, which helps locate critical emotional semantics. The other is fusing complementary cross-modal representation to construct optimal salient feature combinations of multiple modalities. A Conditional Transformer Fusion Network is proposed to handle these problems. Firstly, the authors equip the transformer module with CNN layers to enhance the detection of subtle signal patterns in nonverbal sequences. Secondly, sentiment words are utilised as context conditions to guide the computation of cross-modal attention. As a result, the located nonverbal features are not only salient but also complementary to sentiment words directly. Experimental results show that the authors’ method achieves state-of-the-art performance on several multimodal affective analysis datasets.

Abstract Image

利用 CNN 增强变换器进行条件选择,实现多模态情感分析
注意力机制是近年来多模态情感分析的一种成功方法。尽管取得了进步,但在融合语言及其非语言语境信息方面仍存在一些重大挑战。其一是生成与声学和视觉模态相关的稀疏注意系数,这有助于定位关键的情感语义。另一个是融合互补的跨模态表征,以构建多种模态的最佳突出特征组合。本文提出了一种条件变换器融合网络来处理这些问题。首先,作者为变换器模块配备了 CNN 层,以增强对非语言序列中微妙信号模式的检测。其次,利用情感词作为上下文条件来指导跨模态注意力的计算。因此,所定位的非语言特征不仅是突出的,而且是对情感词的直接补充。实验结果表明,作者的方法在多个多模态情感分析数据集上取得了一流的性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
CAAI Transactions on Intelligence Technology
CAAI Transactions on Intelligence Technology COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE-
CiteScore
11.00
自引率
3.90%
发文量
134
审稿时长
35 weeks
期刊介绍: CAAI Transactions on Intelligence Technology is a leading venue for original research on the theoretical and experimental aspects of artificial intelligence technology. We are a fully open access journal co-published by the Institution of Engineering and Technology (IET) and the Chinese Association for Artificial Intelligence (CAAI) providing research which is openly accessible to read and share worldwide.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信