Attention-Based Context Boosted Cyberbullying Detection in Social Media

J. Data Intell. Pub Date : 2021-11-01 DOI:10.26421/jdi2.4-2

Nabi Rezvani, A. Beheshti


Citations: 2

Abstract

Cyberbullying detection is a rising research topic due to its paramount impact on social media users, especially youngsters and adolescents. While there has been enormous progress in utilizing efficient machine learning and NLP techniques for this task, recent methods have not fully exploited the context surrounding the textual content. The text of social media posts and comments is normally long, noisy and mixed with many irrelevant tokens and characters, so an attention-based approach that can focus on the more relevant parts of the text is well suited to it. Moreover, social media information is normally multi-modal in nature and may contain various metadata and contextual information that can help improve a cyberbullying prediction system. In this research, we propose a novel machine learning method that (i) fine-tunes a variant of BERT, a deep attention-based language model capable of detecting patterns in long and noisy bodies of text; (ii) extracts contextual information from multiple sources, including metadata, images and even external knowledge sources, and uses these features to complement the learner model; and (iii) efficiently combines textual and contextual features using boosting and a wide-and-deep architecture. We compare our proposed method with state-of-the-art methods and show that our approach significantly outperforms them in most cases.
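To make step (iii) more concrete, the following is a minimal sketch of a wide-and-deep forward pass, not the authors' implementation. It assumes the text embedding would come from the fine-tuned BERT variant and the contextual features from metadata or images; here both are plain lists of floats. The class name `WideAndDeep`, the layer sizes, and the choice to route text through the deep part and context through the wide part are illustrative assumptions, and the boosting step and all training are omitted.

```python
import math
import random


def relu(values):
    """Element-wise ReLU."""
    return [max(0.0, v) for v in values]


def dense(x, weights, bias):
    """Affine layer; `weights` holds one row of input weights per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (untrained, for illustration only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class WideAndDeep:
    """Hypothetical wide-and-deep classifier head (assumed layout)."""

    def __init__(self, text_dim, context_dim, hidden_dim=8, seed=0):
        rng = random.Random(seed)
        # Deep part: a one-hidden-layer MLP over the dense text embedding.
        self.w1, self.b1 = init_layer(text_dim, hidden_dim, rng)
        self.w2, self.b2 = init_layer(hidden_dim, 1, rng)
        # Wide part: a single linear layer over the contextual features.
        self.w_wide, self.b_wide = init_layer(context_dim, 1, rng)

    def predict_proba(self, text_embedding, context_features):
        hidden = relu(dense(text_embedding, self.w1, self.b1))
        deep_logit = dense(hidden, self.w2, self.b2)[0]
        wide_logit = dense(context_features, self.w_wide, self.b_wide)[0]
        # The two logits are summed before the final sigmoid.
        return sigmoid(deep_logit + wide_logit)


model = WideAndDeep(text_dim=4, context_dim=3)
# Placeholder inputs: a 4-dimensional "embedding" and 3 contextual features.
probability = model.predict_proba([0.2, -0.1, 0.5, 0.0], [1.0, 0.0, 3.0])
print(probability)
```

In a real system the weights would be learned jointly, and the placeholder inputs would be replaced by the language model's pooled output and the extracted contextual features.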




Source journal

J. Data Intell.

Self-citation rate

0.00%

Articles published