Yanjing Wang, Kai Sun, Bin Shi, Hao Wu, Kaihao Zhang, Bo Dong
{"title":"多模态方面级情感分类中对歧义情感的防范","authors":"Yanjing Wang , Kai Sun , Bin Shi , Hao Wu , Kaihao Zhang , Bo Dong","doi":"10.1016/j.ipm.2025.104375","DOIUrl":null,"url":null,"abstract":"<div><div>Recent advances in multimodal learning have achieved state-of-the-art results in aspect-level sentiment classification by leveraging both text and image data. However, images can sometimes contain contradictory sentiment cues or convey complex messages, making it difficult to accurately determine the sentiment expressed in the text. Intuitively, we should only use image data to complement the text if the latter contains ambiguous sentiment or leans toward the neutral polarity. Therefore, instead of trying to forcefully use images as done in prior work, we develop a Guard against Ambiguous Sentiment (GAS) for multimodal aspect-level sentiment classification (MALSC). Built on a pretrained language model, GAS is equipped with a novel “ambiguity learning” strategy that focuses on learning the degree of sentiment ambiguity within the input text. The sentiment ambiguity then serves to determine the extent to which image information should be utilized for accurate sentiment classification. In our experiments with two benchmark twitter datasets, we found that GAS achieves a performance gain of up to 0.98% in macro-F1 score compared to recent methods in the task. Furthermore, we explore the efficacy of large language models (LLMs) in the MALSC task by employing the core ideas behind GAS to design tailored prompts. We show that multimodal LLMs such as LLaVA, when provided with GAS-principled prompts, yields a 2.4% improvement in macro-F1 score for few-shot learning on the MALSC task.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"63 2","pages":"Article 104375"},"PeriodicalIF":6.9000,"publicationDate":"2025-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A guard against ambiguous sentiment for multimodal aspect-level sentiment classification\",\"authors\":\"Yanjing Wang , Kai Sun , Bin Shi , Hao Wu , Kaihao Zhang , Bo Dong\",\"doi\":\"10.1016/j.ipm.2025.104375\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Recent advances in multimodal learning have achieved state-of-the-art results in aspect-level sentiment classification by leveraging both text and image data. However, images can sometimes contain contradictory sentiment cues or convey complex messages, making it difficult to accurately determine the sentiment expressed in the text. Intuitively, we should only use image data to complement the text if the latter contains ambiguous sentiment or leans toward the neutral polarity. Therefore, instead of trying to forcefully use images as done in prior work, we develop a Guard against Ambiguous Sentiment (GAS) for multimodal aspect-level sentiment classification (MALSC). Built on a pretrained language model, GAS is equipped with a novel “ambiguity learning” strategy that focuses on learning the degree of sentiment ambiguity within the input text. The sentiment ambiguity then serves to determine the extent to which image information should be utilized for accurate sentiment classification. In our experiments with two benchmark twitter datasets, we found that GAS achieves a performance gain of up to 0.98% in macro-F1 score compared to recent methods in the task. 
Furthermore, we explore the efficacy of large language models (LLMs) in the MALSC task by employing the core ideas behind GAS to design tailored prompts. We show that multimodal LLMs such as LLaVA, when provided with GAS-principled prompts, yields a 2.4% improvement in macro-F1 score for few-shot learning on the MALSC task.</div></div>\",\"PeriodicalId\":50365,\"journal\":{\"name\":\"Information Processing & Management\",\"volume\":\"63 2\",\"pages\":\"Article 104375\"},\"PeriodicalIF\":6.9000,\"publicationDate\":\"2025-09-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Information Processing & Management\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0306457325003164\",\"RegionNum\":1,\"RegionCategory\":\"管理学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Processing & Management","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0306457325003164","RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
A guard against ambiguous sentiment for multimodal aspect-level sentiment classification
Recent advances in multimodal learning have achieved state-of-the-art results in aspect-level sentiment classification by leveraging both text and image data. However, images can sometimes contain contradictory sentiment cues or convey complex messages, making it difficult to accurately determine the sentiment expressed in the text. Intuitively, image data should only be used to complement the text when the latter contains ambiguous sentiment or leans toward the neutral polarity. Therefore, instead of forcefully using images as in prior work, we develop a Guard against Ambiguous Sentiment (GAS) for multimodal aspect-level sentiment classification (MALSC). Built on a pretrained language model, GAS is equipped with a novel “ambiguity learning” strategy that focuses on learning the degree of sentiment ambiguity within the input text. The sentiment ambiguity then determines the extent to which image information should be utilized for accurate sentiment classification. In our experiments on two benchmark Twitter datasets, we found that GAS achieves a performance gain of up to 0.98% in macro-F1 score over recent methods on this task. Furthermore, we explore the efficacy of large language models (LLMs) on the MALSC task by employing the core ideas behind GAS to design tailored prompts. We show that multimodal LLMs such as LLaVA, when provided with GAS-principled prompts, yield a 2.4% improvement in macro-F1 score for few-shot learning on the MALSC task.
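The abstract leaves GAS's internal architecture unspecified; the following is a minimal, hypothetical sketch of the ambiguity-gated fusion idea it describes, assuming a text encoder and an image encoder that each produce a fixed-size feature vector (e.g., a BERT-style [CLS] embedding and a ViT image embedding). All module names and dimensions here are illustrative assumptions, not the authors' implementation.

# Hypothetical sketch: gate image features by the estimated sentiment
# ambiguity of the text, per the GAS idea described in the abstract.
import torch
import torch.nn as nn

class AmbiguityGatedFusion(nn.Module):
    def __init__(self, text_dim: int = 768, image_dim: int = 768, num_classes: int = 3):
        super().__init__()
        # Scalar ambiguity score in [0, 1], predicted from the text feature alone.
        self.ambiguity_head = nn.Sequential(nn.Linear(text_dim, 1), nn.Sigmoid())
        self.image_proj = nn.Linear(image_dim, text_dim)
        self.classifier = nn.Linear(text_dim, num_classes)  # negative / neutral / positive

    def forward(self, text_feat: torch.Tensor, image_feat: torch.Tensor) -> torch.Tensor:
        # a -> 1 when the text sentiment is ambiguous or near-neutral, letting
        # the image contribute; a -> 0 when the text alone is decisive.
        a = self.ambiguity_head(text_feat)                   # (batch, 1)
        fused = text_feat + a * self.image_proj(image_feat)  # ambiguity-weighted fusion
        return self.classifier(fused)

# Usage with random stand-ins for encoder outputs:
model = AmbiguityGatedFusion()
logits = model(torch.randn(4, 768), torch.randn(4, 768))  # (4, 3) sentiment logits

The same guard principle carries over to the prompting setup the abstract mentions: a GAS-principled prompt would instruct the multimodal LLM to judge sentiment from the text first and consult the image only when the textual sentiment reads as ambiguous or neutral.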
Journal Introduction:
Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology, marketing, and social computing.
We aim to serve both primary researchers and practitioners by offering an effective platform for the timely dissemination of research on advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.