Using BERT to Extract Topic-Independent Sentiment Features for Social Media Bot Detection

Maryam Heidari, James H. Jones
{"title":"Using BERT to Extract Topic-Independent Sentiment Features for Social Media Bot Detection","authors":"Maryam Heidari, James H. Jones","doi":"10.1109/UEMCON51285.2020.9298158","DOIUrl":null,"url":null,"abstract":"Millions of online posts about different topics and products are shared on popular social media platforms. One use of this content is to provide crowd-sourced information about a specific topic, event, or product. However, this use raises an important question: what percentage of the information available through these services is trustworthy? In particular, might some of this information be generated by a machine, i.e., a \"bot\" instead of a human? Bots can be, and often are, purposely designed to generate enough volume to skew an apparent trend or position on a topic, yet the consumer of such content cannot easily distinguish a bot post from a human post. This paper introduces a new model that uses Bidirectional Encoder Representations from Transformers (Google Bert) for sentiment classification of tweets to identify topic-independent features for the social media bot detection model. Using a Natural Language Processing approach to derive topic-independent features for the new bot detection model distinguishes this work from previous bot detection models. We achieve 94% accuracy classifying the contents of Cresci data set [1] as generated by a bot or a human, where the most accurate prior work achieved an accuracy of 92%.","PeriodicalId":433609,"journal":{"name":"2020 11th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"61","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 11th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/UEMCON51285.2020.9298158","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 61

Abstract

Millions of online posts about different topics and products are shared on popular social media platforms. One use of this content is to provide crowd-sourced information about a specific topic, event, or product. However, this use raises an important question: what percentage of the information available through these services is trustworthy? In particular, might some of this information be generated by a machine, i.e., a "bot" instead of a human? Bots can be, and often are, purposely designed to generate enough volume to skew an apparent trend or position on a topic, yet the consumer of such content cannot easily distinguish a bot post from a human post. This paper introduces a new model that uses Bidirectional Encoder Representations from Transformers (Google Bert) for sentiment classification of tweets to identify topic-independent features for the social media bot detection model. Using a Natural Language Processing approach to derive topic-independent features for the new bot detection model distinguishes this work from previous bot detection models. We achieve 94% accuracy classifying the contents of Cresci data set [1] as generated by a bot or a human, where the most accurate prior work achieved an accuracy of 92%.
基于BERT提取话题无关情感特征的社交媒体机器人检测
在流行的社交媒体平台上,数以百万计的关于不同话题和产品的在线帖子被分享。此内容的一个用途是提供关于特定主题、事件或产品的众包信息。然而,这种用法提出了一个重要的问题:通过这些服务提供的信息中有多少是可信的?特别是,其中一些信息可能是由机器(即“bot”)而不是人类生成的吗?机器人可以,而且经常是,故意设计来产生足够的量来扭曲一个主题的明显趋势或立场,然而这些内容的消费者无法轻易区分机器人的帖子和人类的帖子。本文介绍了一种新的模型,该模型使用来自变形金刚的双向编码器表示(Google Bert)对推文进行情感分类,为社交媒体机器人检测模型识别与主题无关的特征。使用自然语言处理方法为新的机器人检测模型派生与主题无关的特征,将这项工作与以前的机器人检测模型区分开来。我们将Cresci数据集[1]的内容分类为机器人或人类生成的准确率达到了94%,其中最准确的先前工作达到了92%的准确率。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信