Feature extractions and selection of bot detection on Twitter A systematic literature review

Raad Al-azawi, S. Al-Mamory
Journal: Inteligencia Artif. · Publication date: 2022-04-13 · Type: Journal Article · DOI: 10.4114/intartif.vol25iss69pp57-86 (https://doi.org/10.4114/intartif.vol25iss69pp57-86)
Citations: 5

Abstract

Social bots are automated or semi-automated computer programs that imitate humans or human behavior in online social networks. They can be used to attack users in pursuit of hidden aims, such as spreading information or influencing targets. While researchers develop a variety of methods to detect social media bot accounts, attackers adapt their bots to evade detection. The field therefore requires continual development, particularly in feature selection and extraction. This study provides an overview of bot attacks on Twitter, sheds light on the feature extraction and selection issues that significantly affect the accuracy of bot detection algorithms, and highlights weaknesses in training time and dimensionality reduction. To the best of our knowledge, it is the first systematic literature review, based on a preset search strategy, of literature published between 2018 and 2021 that is concerned with Twitter features (attributes). The key findings are threefold. First, the paper provides an improved taxonomy of feature extraction and selection approaches. Second, it gives a comprehensive overview of approaches for detecting bots on the Twitter platform, particularly machine learning techniques. Percentages calculated with the proposed taxonomy show metadata, tweet text, and merged (metadata plus tweet text) features accounting for 37%, 31%, and 32%, respectively. Third, it highlights several gaps for further research. The first is that public datasets are not precise or suitably sized. The second is that integrated systems and real-time detection are rarely used. The third is that each identified bot category needs to be detected separately, rather than detecting all categories with one generic model and the same feature values. Finally, extracting influential features that help machine learning algorithms detect Twitter bots with high accuracy is critical, especially when the type of bot is predetermined.
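
The three feature groups named in the taxonomy (metadata, tweet text, and the merger of both) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the reviewed studies: the toy accounts, the field names, the four features, and the choice of a simple correlation-based filter as the selection step.

```python
import re
from statistics import mean, pstdev

# Hypothetical toy accounts. Field names are illustrative only and do not
# follow the Twitter API schema or any dataset from the review.
ACCOUNTS = [
    {"followers": 12, "friends": 950, "is_bot": 1,
     "tweets": ["Win a prize http://a.example #free",
                "Click http://b.example #deal #free"]},
    {"followers": 8, "friends": 1200, "is_bot": 1,
     "tweets": ["Follow back http://c.example #follow"]},
    {"followers": 340, "friends": 310, "is_bot": 0,
     "tweets": ["Lovely walk by the river today",
                "Anyone read a good book lately?"]},
    {"followers": 150, "friends": 180, "is_bot": 0,
     "tweets": ["Great match last night #football"]},
]

URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")


def metadata_features(account):
    """Features computed from account metadata only."""
    return {"follower_friend_ratio": account["followers"] / (account["friends"] + 1)}


def text_features(account):
    """Features computed from tweet text only."""
    tweets = account["tweets"]
    return {
        "urls_per_tweet": mean(len(URL_RE.findall(t)) for t in tweets),
        "hashtags_per_tweet": mean(len(HASHTAG_RE.findall(t)) for t in tweets),
        "mean_tweet_length": mean(len(t) for t in tweets),
    }


def merged_features(account):
    """The merged group: metadata and tweet-text features side by side."""
    return {**metadata_features(account), **text_features(account)}


def correlation(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(xs), mean(ys)
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (sx * sy)


def select_features(rows, labels, k):
    """Filter-style selection: keep the k features whose absolute
    correlation with the bot label is highest."""
    scores = {name: abs(correlation([r[name] for r in rows], labels))
              for name in rows[0]}
    return sorted(scores, key=scores.get, reverse=True)[:k]


rows = [merged_features(a) for a in ACCOUNTS]
labels = [a["is_bot"] for a in ACCOUNTS]
selected = select_features(rows, labels, k=2)
print(selected)
```

Keeping only the top-ranked features is one simple way to reduce dimensionality before training a classifier. A real pipeline would need a far larger labeled dataset and would probably compare several selection methods.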