DWAEF: a deep weighted average ensemble framework harnessing novel indicators for sarcasm detection

IF 2.5 | Zone 2, Computer Science | Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS

EPJ Data Science Pub Date : 2023-08-25 DOI:10.3233/ds-220058

Richa Sharma, Simrat Deol, Udit Kaushish, Prakher Pandey, Vishal Maurya


Citations: 0

Abstract

Sarcasm is a linguistic phenomenon that often signals a disparity between literal and inferred meaning. Because of this complexity, it is typically difficult to discern in online text messages. Consequently, sarcasm detection has received considerable attention from both academia and industry in recent years. Nevertheless, most current approaches simply feed low-level indicators of sarcasm into various machine learning algorithms. This paper aims to present sarcasm in a new light by using novel indicators in a deep weighted average ensemble-based framework (DWAEF). The novel indicators exploit the presence of simile and metaphor in text and detect subtle shifts in tone at the structural level of a sentence. A graph neural network (GNN) is used to detect simile, bidirectional encoder representations from transformers (BERT) embeddings are used to detect metaphorical instances, and fuzzy logic is used to account for shifts in tone. To decide whether sarcasm is present, DWAEF integrates the inputs from the novel indicators. The framework is evaluated on a self-curated dataset of online text messages, and results obtained with primitive features alone are compared against those obtained with primitive features combined with the proposed indicators. DWAEF, which combines the primitive features and the novel indicators, achieved the highest accuracy at 92%, whereas Support Vector Machine (SVM) achieved 78.58%, the lowest among all classifiers.
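The abstract does not say how the ensemble weights are chosen or what form each indicator's output takes. As a minimal sketch of the general weighted-average idea only, the Python below assumes each component emits a probability of sarcasm per message and that the weights are fixed in advance. The component names (`simile_gnn`, `metaphor_bert`, `tone_fuzzy`), the weights, and the scores are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def weighted_average_ensemble(
    probabilities: Dict[str, List[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Combine per-model sarcasm probabilities into one score per message.

    probabilities maps a model name to its predicted probability of sarcasm
    for each message; weights maps the same names to non-negative weights.
    """
    if not probabilities:
        raise ValueError("at least one model is required")
    if set(probabilities) != set(weights):
        raise ValueError("probabilities and weights must name the same models")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    lengths = {len(p) for p in probabilities.values()}
    if len(lengths) != 1:
        raise ValueError("every model must score the same number of messages")

    n_messages = lengths.pop()
    combined = []
    for i in range(n_messages):
        # Weighted sum of the model outputs, normalised by the total weight.
        score = sum(weights[name] * probabilities[name][i] for name in probabilities)
        combined.append(score / total)
    return combined


def classify(scores: List[float], threshold: float = 0.5) -> List[bool]:
    """Label a message as sarcastic when its combined score reaches the threshold."""
    return [s >= threshold for s in scores]


# Hypothetical outputs for three messages from three illustrative components.
example_probabilities = {
    "simile_gnn": [0.9, 0.2, 0.6],
    "metaphor_bert": [0.8, 0.1, 0.4],
    "tone_fuzzy": [0.7, 0.3, 0.2],
}
# Assumed fixed weights; in practice they might be tuned on validation data.
example_weights = {"simile_gnn": 2.0, "metaphor_bert": 1.0, "tone_fuzzy": 1.0}

scores = weighted_average_ensemble(example_probabilities, example_weights)
labels = classify(scores)
```

Normalising by the total weight keeps the combined score in the same 0 to 1 range as the inputs, so a single threshold can be applied regardless of how the weights are scaled.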


Source journal: EPJ Data Science (MATHEMATICS, INTERDISCIPLINARY APPLICATIONS)

CiteScore: 6.10

Self-citation rate: 5.60%

Article count: not listed

Review time: 13 weeks

About the journal: EPJ Data Science covers a broad range of research areas and applications and particularly encourages contributions from techno-socio-economic systems, where it comprises those research lines that now regard the digital “tracks” of human beings as first-order objects for scientific investigation. Topics include, but are not limited to, human behavior, social interaction (including animal societies), economic and financial systems, management and business networks, socio-technical infrastructure, health and environmental systems, the science of science, as well as general risk and crisis scenario forecasting up to and including policy advice.