Countering Malicious Content Moderation Evasion in Online Social Networks: Simulation and Detection of Word Camouflage

Álvaro Huertas-García, Alejandro Martín, J. Huertas-Tato, David Camacho
Journal: Appl. Comput. Intell. Soft Comput. (volume: 23 1, pages: 110552)
Publication date: 2022-12-27
Publication type: Journal Article
DOI: 10.48550/arXiv.2212.14727 (https://doi.org/10.48550/arXiv.2212.14727)
Citation count: 2

Abstract

Content moderation is the process of screening and monitoring user-generated content online. It plays a crucial role in stopping content that results from unacceptable behaviors on online social platforms, such as hate speech, harassment, violence against specific groups, terrorism, racism, xenophobia, homophobia, and misogyny, to name a few. These platforms use a wide range of tools to detect and manage malicious information; however, malicious actors also improve their skills, developing strategies to circumvent these barriers and continue spreading misleading information. Twisting and camouflaging keywords are among the most widely used techniques for evading platform content moderation systems. In response to this ongoing issue, this paper presents an innovative approach to address this linguistic trend in social networks through the simulation of different content evasion techniques and a multilingual Transformer model for content evasion detection. Specifically, we share with the scientific community a public multilingual tool, named "pyleetspeak", that generates and simulates content evasion in a customizable way through automatic word camouflage, together with a multilingual Transformer-based Named-Entity Recognition (NER) model tuned to recognize and detect it. The multilingual NER model is evaluated in different textual scenarios, detecting different types and mixtures of camouflage techniques, and achieves an overall weighted F1 score of 0.8795. This article contributes significantly to countering malicious information by providing multilingual tools to simulate and detect new methods of evading content moderation on social networks, making the fight against information disorders more effective.
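To make the idea of simulated word camouflage concrete, the following is a minimal sketch, assuming a simple character-substitution ("leetspeak") scheme and token-level labels of the kind an NER model could be trained on. It does not reproduce the pyleetspeak API or the authors' model: the substitution table `LEET_MAP`, the functions `camouflage_word` and `camouflage_sentence`, and the `CAMO`/`O` label names are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical substitution table for illustration only; this is NOT the
# mapping used by pyleetspeak.
LEET_MAP = {"a": "4", "e": "3", "i": "1", "o": "0", "s": "5", "t": "7"}


def camouflage_word(word, rng, prob=1.0):
    """Replace each mappable character with a look-alike with probability `prob`."""
    out = []
    for ch in word:
        sub = LEET_MAP.get(ch.lower())
        # rng.random() lies in [0, 1), so prob=1.0 always substitutes
        # and prob=0.0 never does.
        if sub is not None and rng.random() < prob:
            out.append(sub)
        else:
            out.append(ch)
    return "".join(out)


def camouflage_sentence(sentence, targets, seed=0, prob=1.0):
    """Camouflage the target words and return (tokens, labels).

    Labels use an assumed NER-style scheme: "CAMO" for tokens that were
    altered and "O" for everything else.
    """
    rng = random.Random(seed)
    target_set = {t.lower() for t in targets}
    tokens, labels = [], []
    for tok in sentence.split():
        if tok.lower() in target_set:
            new_tok = camouflage_word(tok, rng, prob)
            tokens.append(new_tok)
            labels.append("CAMO" if new_tok != tok else "O")
        else:
            tokens.append(tok)
            labels.append("O")
    return tokens, labels


if __name__ == "__main__":
    toks, tags = camouflage_sentence("this vaccine is a hoax", {"vaccine"})
    print(toks)  # ['this', 'v4cc1n3', 'is', 'a', 'hoax']
    print(tags)  # ['O', 'CAMO', 'O', 'O', 'O']
```

Generating the camouflaged text and its labels in the same pass is what makes this kind of simulation useful as training data: every altered token comes with a ground-truth tag, so no manual annotation is needed. Lowering `prob` produces partially camouflaged words, a rough stand-in for the "mixtures of camouflage techniques" mentioned in the abstract.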