Manifold knowledge-guided feature fusion network for multimodal sentiment analysis

IF 7.5; Zone 1, Computer Science; Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Expert Systems with Applications Pub Date : 2025-04-09 DOI:10.1016/j.eswa.2025.127537

Xingang Wang, Mengyi Wang, Hai Cui, Yijia Zhang

Expert Systems with Applications, Volume 280, Article 127537. Journal Article. https://www.sciencedirect.com/science/article/pii/S0957417425011595

Citations: 0

Abstract

With the continuing progress of multimedia and information technology, multimodal sentiment analysis (MSA) has become one of the most advanced and challenging research directions in artificial intelligence. Multimodal data, including text, visual, and audio information, provide additional perspectives for sentiment analysis. However, extraneous information in the non-verbal modalities degrades accuracy, because sentiment-related features are concentrated mainly in mouth movements and pitch changes. To address this problem, we propose a manifold knowledge-guided feature fusion network (MKGN). MKGN uses manifold knowledge generated by manifold learning algorithms to guide neural networks to extract effective non-verbal features and to establish associations between multiple features while reducing dimensionality. To improve the quality of this knowledge, we also propose two knowledge enhancement methods: a knowledge filter (KF) and knowledge contrastive learning (CL). KF filters out unreliable knowledge, and CL further strengthens the retained knowledge by changing the distances between knowledge items. The proposed MKGN achieves excellent performance on three datasets compared to state-of-the-art models. On the MOSI dataset, accuracy improves by 2% and 1%, respectively. On the MOSEI dataset, accuracy improves by 3.8% and 1.8%, respectively. On the UR-FUNNY dataset, accuracy improves by 0.4%.
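The two knowledge enhancement steps can be pictured with a toy sketch. The abstract does not say how reliability is scored or which contrastive objective is used, so everything below is an assumption made for illustration rather than the authors' method: the per-item reliability scores, the `threshold` value, the cosine similarity, and the InfoNCE-style loss are all hypothetical stand-ins.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def knowledge_filter(vectors, scores, threshold=0.5):
    """Keep only knowledge vectors whose (assumed) reliability score
    reaches the threshold. How such scores are obtained is not stated
    in the abstract; they are supplied by the caller here."""
    return [v for v, s in zip(vectors, scores) if s >= threshold]


def contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss (an assumed objective): low when the anchor is
    close to the positive and far from every negative."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Toy knowledge vectors with made-up reliability scores.
knowledge = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
reliability = [0.9, 0.8, 0.2]
kept = knowledge_filter(knowledge, reliability, threshold=0.5)

# A well-separated arrangement yields a lower loss than a poorly separated one.
good = contrastive_loss([1.0, 0.0], [0.9, 0.1], [[0.0, 1.0]])
bad = contrastive_loss([1.0, 0.0], [0.0, 1.0], [[0.9, 0.1]])
```

In this sketch, minimizing the loss pulls related knowledge vectors together and pushes unrelated ones apart, which is one common way to "change the distance" between items; the paper itself may define both steps differently.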


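Source journal

Expert Systems with Applications (Engineering & Technology, Engineering: Electrical & Electronic)

CiteScore: 13.80
Self-citation rate: 10.60%
Articles published: 2045
Review time: 8.7 months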

Journal description: Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.