Twitter Bot Detection Using Neural Networks and Linguistic Embeddings

IEEE Open Journal of the Computer Society Pub Date : 2023-08-07 DOI:10.1109/OJCS.2023.3302286

Feng Wei;Uyen Trang Nguyen

IEEE Open Journal of the Computer Society, vol. 4, pp. 218-230. Full text: https://ieeexplore.ieee.org/document/10210119/

Citations: 1

Abstract

Twitter is a web application playing the dual role of online social networking and micro-blogging. The popularity and open structure of Twitter have attracted a large number of automated programs, known as bots. In this article, we propose a Twitter bot detection model using recurrent neural networks, specifically a bidirectional lightweight gated recurrent unit (BiLGRU), and linguistic embeddings. To the best of our knowledge, our Twitter bot detection model is the first that does not require any handcrafted features, or prior knowledge or assumptions about account profiles, friendship networks or historical behavior. The proposed model uses only the textual content of tweets and linguistic embeddings to classify bot and human accounts on Twitter. Experimental results show that the proposed model performs better than or comparably to state-of-the-art Twitter bot detection models while requiring no feature engineering, making it faster and easier to train and deploy in a real network. We also present experimental results that show the performance and computational costs of different types of linguistic embeddings and recurrent network variants for the task of Twitter bot detection. The results will potentially help researchers design high-performance deep-learning models for similar tasks.
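To make the described pipeline concrete (tweet text, then embeddings, then a bidirectional recurrent layer, then a bot/human score), the following is a minimal sketch and not the authors' implementation. It assumes a standard GRU cell in place of the paper's lightweight variant, whose details the abstract does not give, and it uses random, untrained embeddings and weights, so the scores it produces carry no meaning. All names (`GRUCell`, `BiGRUClassifier`, `bot_probability`) and dimensions are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Standard GRU cell (assumption: NOT the paper's lightweight GRU)."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        # Input (W) and recurrent (U) weights for the update gate "z",
        # the reset gate "r" and the candidate state "n".
        self.W = {g: rand_matrix(hidden_dim, input_dim, rng) for g in "zrn"}
        self.U = {g: rand_matrix(hidden_dim, hidden_dim, rng) for g in "zrn"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in
             zip(matvec(self.W["n"], x), matvec(self.U["n"], reset_h))]
        # Blend the previous state and the candidate state.
        return [zi * hi + (1.0 - zi) * ni for zi, hi, ni in zip(z, h, n)]

    def run(self, sequence):
        """Return the final hidden state after reading the whole sequence."""
        h = [0.0] * self.hidden_dim
        for x in sequence:
            h = self.step(x, h)
        return h


class BiGRUClassifier:
    """Untrained text-only classifier: embeddings -> BiGRU -> sigmoid."""

    def __init__(self, vocab, embed_dim=8, hidden_dim=6, seed=0):
        rng = random.Random(seed)
        # Random vectors stand in for pretrained linguistic embeddings.
        self.embed = {tok: [rng.uniform(-0.5, 0.5) for _ in range(embed_dim)]
                      for tok in vocab}
        self.unknown = [0.0] * embed_dim
        self.forward_cell = GRUCell(embed_dim, hidden_dim, rng)
        self.backward_cell = GRUCell(embed_dim, hidden_dim, rng)
        self.out_weights = [rng.uniform(-0.1, 0.1)
                            for _ in range(2 * hidden_dim)]
        self.out_bias = 0.0

    def bot_probability(self, tweet):
        tokens = tweet.lower().split()
        seq = [self.embed.get(tok, self.unknown) for tok in tokens]
        # Read the tweet left-to-right and right-to-left, then concatenate.
        features = self.forward_cell.run(seq) + self.backward_cell.run(seq[::-1])
        logit = sum(w * f for w, f in zip(self.out_weights, features))
        return sigmoid(logit + self.out_bias)


model = BiGRUClassifier(vocab=["win", "a", "free", "phone", "now", "click"])
score = model.bot_probability("Click now win a free phone")
print(f"untrained bot score: {score:.3f}")
```

The only input is the tweet text, which mirrors the abstract's claim that no profile, network or behavioral features are needed. A usable detector would train these weights on labeled accounts and load real pretrained embeddings instead of random vectors.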


Source journal

IEEE Open Journal of the Computer Society

CiteScore

12.60

Self-citation rate

0.00%

Publications