Hierarchical Reasoning Enhanced Few-Shot Multimodal Sentiment Analysis
Jiali You, Haoran Li, Jiawen Deng, Wei Li, Yuanyuan He, Fuji Ren
Neurocomputing, Volume 651, Article 130883. Published 2025-07-14. DOI: 10.1016/j.neucom.2025.130883
https://www.sciencedirect.com/science/article/pii/S0925231225015553
Few-shot Multimodal Sentiment Analysis (FMSA) aims to predict sentiment from minimal labeled data by integrating multiple modalities, such as text and images. Recent FMSA methods transform non-linguistic information (e.g., images) into text and use language models to recast the problem as few-shot filling tasks, but they still struggle to capture the latent sentiment information in image–text pairs. This limitation hinders their effectiveness, particularly in real-world applications where labeled data is scarce. To address it, we propose Hierarchical Reasoning Enhanced Few-shot Multimodal Sentiment Analysis (HRE-FMSA), which consists of three main components: the Hierarchical Reasoning Framework (HRF), the Hierarchical Reasoning Representation Fusion Network (H2RF-Net), and label prediction. Concretely, the HRF module extracts latent sentiment information from image–text pairs at three levels: topic/aspect, opinion, and sentiment. H2RF-Net then integrates this latent sentiment information with the original image–text pairs to generate a prompt, which is fed into a pre-trained language model to obtain the final sentiment type. We evaluate HRE-FMSA on three sentence-level datasets and two aspect-level datasets; the results demonstrate its effectiveness and applicability.
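The flow described in the abstract (three levels of reasoning, a generated prompt, then label prediction by a language model) can be illustrated with a minimal sketch. This is not the authors' implementation: H2RF-Net is described as a fusion network, whereas the sketch below swaps in a plain string template, and the names `HierarchicalReasoning`, `build_prompt`, `decode_label`, the `VERBALIZER` label words, and the `[MASK]` token are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class HierarchicalReasoning:
    """The three levels of latent sentiment information named in the abstract."""
    topic: str      # topic/aspect level
    opinion: str    # opinion level
    sentiment: str  # sentiment level


# Hypothetical verbalizer: label word at the mask position -> sentiment class.
VERBALIZER = {"good": "positive", "okay": "neutral", "bad": "negative"}


def build_prompt(text: str, image_caption: str,
                 reasoning: HierarchicalReasoning,
                 mask_token: str = "[MASK]") -> str:
    """Assemble a filling-style prompt from the pair and its reasoning.

    Assumption: a simple template stands in for the learned fusion step.
    """
    parts = [
        f"Text: {text}",
        f"Image: {image_caption}",
        f"Topic: {reasoning.topic}",
        f"Opinion: {reasoning.opinion}",
        f"Sentiment clue: {reasoning.sentiment}",
        f"Overall, the sentiment is {mask_token}.",
    ]
    return " ".join(parts)


def decode_label(mask_scores: Mapping[str, float]) -> str:
    """Return the class whose label word scores highest at the mask."""
    # Ignore tokens that are not label words, however high they score.
    candidates = {w: s for w, s in mask_scores.items() if w in VERBALIZER}
    if not candidates:
        raise ValueError("no verbalizer word among the scored tokens")
    best_word = max(candidates, key=candidates.get)
    return VERBALIZER[best_word]
```

In use, a language model would score candidate tokens at the mask position of the string returned by `build_prompt`, and `decode_label` would map the best-scoring label word to a sentiment class. The image caption and the three reasoning strings are assumed to come from upstream components that the sketch does not model.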
About the journal:
Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. It covers neurocomputing theory, practice, and applications.