THE EFFECT OF ENGAGEMENT INTENSITY AND LEXICAL RICHNESS IN IDENTIFYING BOT ACCOUNTS ON TWITTER

Isa Inuwa-Dutse, Bello Shehu Bello, Ioannis Korkontzelos, R. Heckel
Journal: IADIS International Journal on WWW/Internet
Published: 2018-12-21
DOI: 10.33965/IJWI_2018161204 (https://doi.org/10.33965/IJWI_2018161204)
Citations: 5

Abstract

The rise in the number of automated, or bot, accounts on Twitter engaging in manipulative behaviour is of great concern to studies that use social media as a primary data source. Many detection strategies have been proposed and implemented; however, the sophistication and rate of deployment of bot accounts are increasing rapidly, which impedes and limits the capabilities of detection strategies. Various features broadly related to account profiles, tweet content, network and temporal patterns have been utilised in detection systems. Tweet content has proven instrumental in this process, but its use has been limited to the terms and entities occurring in tweets. Given a set of tweets with no obvious pattern, can we distinguish content produced by social bots from that produced by humans? What constitutes engagement on Twitter, and how can we measure the intensity of engagement among Twitter users? Can we distinguish between bot and human accounts based on engagement intensity? These are important questions whose answers will improve how detection systems operate to combat malicious activities by effectively distinguishing between human and social bot accounts on Twitter. This study attempts to answer these questions by analysing the engagement intensity and lexical richness of tweets produced by human and social bot accounts using large, diverse datasets. Our results show a clear margin between the two classes in terms of engagement intensity and lexical richness. We found that it is extremely rare for a social bot to engage meaningfully with other users and that lexical features significantly improve the performance of classifying both account types. These are important dimensions to explore toward improving the effectiveness of detection systems in combating the menace of social bot accounts on Twitter.
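The abstract does not state which lexical richness measures the study uses. As a purely illustrative sketch, the type-token ratio (unique tokens divided by total tokens) is one common way to quantify lexical richness. The function name, the regex tokenizer, and the two sample strings below are assumptions made for illustration and are not taken from the paper.

```python
import re


def type_token_ratio(text: str) -> float:
    """Return unique tokens / total tokens for a piece of text.

    Illustrative only: the paper's actual lexical features are not
    specified in the abstract.
    """
    # Lowercase and keep simple word-like tokens. A real tweet pipeline
    # would also need to handle hashtags, mentions, URLs and emoji.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    if not tokens:
        return 0.0  # avoid division by zero on empty input
    return len(set(tokens)) / len(tokens)


# Hypothetical examples: a repetitive, template-like message versus a
# more varied one.
repetitive = "win a prize now win a prize now win a prize now"
varied = "just landed in lisbon and the coffee here is unreal"

print(round(type_token_ratio(repetitive), 3))
print(round(type_token_ratio(varied), 3))
```

Under this measure, heavily repeated wording yields a lower ratio than varied wording. The ratio is sensitive to text length, so comparisons are only meaningful between samples of similar size.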