A sarcasm detection method based on modality inconsistencies and textual knowledge enhancement

IF 6.6; Zone 1, Computer Science; JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Applied Soft Computing. Pub Date: 2025-05-14. DOI: 10.1016/j.asoc.2025.113225

Yuxin Han, Runtao Yang, Mingyu Zhu, Lina Zhang

Volume 177, Article 113225. Full text: https://www.sciencedirect.com/science/article/pii/S1568494625005368

Citations: 0

Abstract

Sarcasm detection aims to identify emotional tendencies in tweets, which helps governments and enterprises monitor online public opinion. Tweets can contain both images and text. Existing sarcasm detection methods mainly focus on extracting high-level semantic information from images while ignoring textual information. However, previous research has demonstrated that text is more important than images in sentiment analysis tasks. Inspired by this, we reduce the involvement of image information and investigate sarcasm detection from a textual perspective. First, we divide the text in the primary dataset into pure text and hashtags, and fuse the hashtags with high-frequency words from the pure text. Then, considering the difference in data distribution between the training corpus of Bidirectional Encoder Representations from Transformers (BERT) and the sarcasm detection corpus, we use the Twitter sentiment analysis corpus to further pre-train the BERT model, obtaining the Basic_BERT and Hash_BERT models as feature extractors for the pure text and the hashtags, respectively. Furthermore, to make better use of the text in this task, we propose a cross-gate mechanism built from a cross-attention transformer module and a similarity constraint. The cross-attention transformer module generates a representation of intra-modal and inter-modal fusion, while the similarity constraint balances the original modal representation against the fused modal representation. On the sarcasm detection dataset, the proposed model achieves an F1-score of 87.22%, an improvement of 3.30% over the most advanced model.
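The text preprocessing step described in the abstract (separating pure text from hashtags, then fusing the hashtags with high-frequency words) can be illustrated with a minimal Python sketch. The abstract does not specify how the split or the fusion is performed, so everything here is an assumption for illustration: the function names `split_tweet`, `high_frequency_words`, and `fuse_hashtags`, the regular expressions, the `top_k` and `min_len` parameters, and fusion by simple concatenation are all hypothetical and are not taken from the paper.

```python
import re
from collections import Counter

# Assumption: a hashtag is '#' followed by word characters.
HASHTAG_RE = re.compile(r"#(\w+)")


def split_tweet(tweet):
    """Split one tweet into (pure_text, hashtags)."""
    hashtags = [tag.lower() for tag in HASHTAG_RE.findall(tweet)]
    # Remove the hashtags and collapse the leftover whitespace.
    pure_text = " ".join(HASHTAG_RE.sub(" ", tweet).split())
    return pure_text, hashtags


def high_frequency_words(pure_texts, top_k=3, min_len=4):
    """Return the top_k most frequent words across the pure texts.

    Assumption: words shorter than min_len are skipped as a crude
    stand-in for stop-word filtering.
    """
    counts = Counter(
        word
        for text in pure_texts
        for word in re.findall(r"[a-z']+", text.lower())
        if len(word) >= min_len
    )
    return [word for word, _ in counts.most_common(top_k)]


def fuse_hashtags(hashtags, frequent_words):
    """Assumed fusion: concatenate, dropping duplicates, keeping order."""
    return " ".join(dict.fromkeys(hashtags + frequent_words))
```

In this sketch, the pure text would be passed to one feature extractor and the fused hashtag string to another, mirroring the roles the abstract assigns to Basic_BERT and Hash_BERT; the actual inputs those models receive in the paper may differ.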

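Source journal

Applied Soft Computing (Engineering & Technology, Computer Science: Interdisciplinary Applications)

CiteScore: 15.80

Self-citation rate: 6.90%

Articles published: 874

Review time: 10.9 months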

About the journal: Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. The focus is to publish the highest quality research in application and convergence of the areas of Fuzzy Logic, Neural Networks, Evolutionary Computing, Rough Sets and other similar techniques to address real-world complexities. Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them. Therefore, the website will be continuously updated with new articles and the publication time will be short.