CyberVandalism Detection in Wikipedia Using Light Architecture of 1D-CNN

Azha Talal Mohammed Ali, Huda Hallawi, Noor D. Al-Shakarchy
DOI: 10.33640/2405-609x.3321 (https://doi.org/10.33640/2405-609x.3321)
Published: 2023-10-10 (Journal Article)

Abstract

The rapid expansion of human-software-agent interaction has come with new issues. Accordingly, different engagements are necessary to adapt to changing human needs in dynamic socio-technical systems. Generally, cybervandalism is the act of modifying a piece of writing in a way that leaves a negative impact on it. In Wikipedia, vandalism is any attempt to modify an article in a way that negatively affects the article's quality. Recently, several automatic detection techniques and related features have been developed to address this issue. This work introduces a deep learning model with a new and light architecture to detect vandalism in Wikipedia articles. The proposed model employs a one-dimensional convolutional neural network (1D CNN) architecture that determines the type of modification in Wikipedia articles in two main stages, feature extraction and vandalism detection, preceded by a data-resampling step that addresses class imbalance in the dataset. Features are extracted from edits and their associated metadata, together with new features (reviewers' trust); only the salient features are then used to decide whether a modification to the article is regular or vandalism, which can contribute to improving prediction accuracy. The experiments were conducted on a benchmark dataset, the PAN-WVC-2010 corpus, taken from a vandalism detection competition hosted at the CLEF conference. The proposed system, with the new features added, achieved an accuracy of 100%.
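The abstract names a data-resampling step followed by a 1D CNN but gives no further detail, so the following is only a minimal sketch of how those two ideas fit together, not the authors' implementation. Random oversampling is an assumed choice of resampling method; the feature vectors, the single hand-picked filter, and all function names are hypothetical and exist only for illustration.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until all classes are equally frequent.

    Assumption: the abstract does not say which resampling method is used;
    random oversampling is shown here only as one common option.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def conv1d(x, kernel, bias=0.0):
    """Stride-1 1D convolution without padding (cross-correlation form)."""
    k = len(kernel)
    return [
        sum(x[i + j] * kernel[j] for j in range(k)) + bias
        for i in range(len(x) - k + 1)
    ]


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def global_max_pool(values):
    """Collapse a feature map to its single strongest activation."""
    return max(values)


# Hypothetical per-edit feature vectors; the values are invented for illustration.
rows = [
    [0.1, 0.0, 0.2, 0.9],  # regular
    [0.2, 0.1, 0.1, 0.8],  # regular
    [0.0, 0.1, 0.3, 0.7],  # regular
    [0.9, 0.8, 0.7, 0.1],  # vandalism
]
labels = [0, 0, 0, 1]

# Step 1: balance the classes before training.
balanced_rows, balanced_labels = random_oversample(rows, labels)
class_counts = Counter(balanced_labels)

# Step 2: slide one width-2 filter over a feature vector.
# A trained 1D CNN would learn many filters; this one is fixed by hand.
feature_map = relu(conv1d(rows[3], kernel=[1.0, -1.0]))
pooled = global_max_pool(feature_map)
```

In a real model the pooled activations from many learned filters would feed a final classification layer that outputs regular versus vandalism; that part is omitted here because the abstract does not describe it.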
Source journal: Karbala International Journal of Modern Science (Multidisciplinary)
CiteScore: 2.50
Self-citation rate: 0.00%
Articles published: 54