Which is more faithful, seeing or saying? Multimodal sarcasm detection exploiting contrasting sentiment knowledge
Yutao Chen, Shumin Shi, Heyan Huang
CAAI Transactions on Intelligence Technology, vol. 10, no. 2, pp. 375-386. Published 2024-12-27. Journal Article.
DOI: 10.1049/cit2.12400
URL: https://onlinelibrary.wiley.com/doi/10.1049/cit2.12400
PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12400
Impact factor: 8.4; JCR: Q1 (Computer Science, Artificial Intelligence)
Citation count: 0
Abstract
Using sarcasm on social media platforms to express negative opinions towards a person or object has become increasingly common. However, detecting sarcasm in various forms of communication can be difficult because the sentiments expressed conflict with one another. In this paper, we introduce a contrasting sentiment-based model for multimodal sarcasm detection (CS4MSD), which identifies inconsistent emotions by leveraging the CLIP knowledge module to produce sentiment features for both text and image. Five external sentiments are then introduced to prompt the model to learn sentiment preferences across modalities. Furthermore, we highlight the importance of verbal descriptions embedded in illustrations and incorporate additional knowledge-sharing modules to fuse such image-like features. Experimental results demonstrate that our model achieves state-of-the-art performance on the public multimodal sarcasm dataset.
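The idea of contrasting sentiment across modalities can be illustrated with a small sketch. This is not the CS4MSD architecture, which the abstract does not specify in detail. It assumes that text, image, and sentiment-prompt embeddings already live in one shared space (as CLIP-style encoders provide) and swaps in a Jensen-Shannon divergence as a simple stand-in for a learned incongruity signal. The five sentiment labels, the function names, and the temperature value are all hypothetical.

```python
import math

# Hypothetical labels: the abstract mentions "five external sentiments"
# but does not name them.
SENTIMENTS = ["very negative", "negative", "neutral", "positive", "very positive"]


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def softmax(scores, temperature=0.1):
    """Numerically stable softmax; the temperature is an assumed value."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sentiment_distribution(embedding, sentiment_embeddings):
    """Distribution over sentiment prompts for one modality's embedding."""
    return softmax([cosine(embedding, s) for s in sentiment_embeddings])


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log), bounded by ln 2."""
    mid = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)


def sentiment_contrast(text_emb, image_emb, sentiment_embs):
    """Higher values mean the two modalities disagree more on sentiment."""
    p_text = sentiment_distribution(text_emb, sentiment_embs)
    p_image = sentiment_distribution(image_emb, sentiment_embs)
    return js_divergence(p_text, p_image)
```

In practice the embeddings would come from a pretrained vision-language encoder and the contrast score would feed a trained classifier rather than being thresholded directly. The sketch only shows why disagreement between per-modality sentiment distributions is a usable sarcasm cue.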
About the journal:
CAAI Transactions on Intelligence Technology is a leading venue for original research on the theoretical and experimental aspects of artificial intelligence technology. We are a fully open access journal co-published by the Institution of Engineering and Technology (IET) and the Chinese Association for Artificial Intelligence (CAAI) providing research which is openly accessible to read and share worldwide.