Predicting susceptibility to social bots on Twitter

2013 IEEE 14th International Conference on Information Reuse & Integration (IRI). Pub Date: 2013-10-24. DOI: 10.1109/IRI.2013.6642447

Randall Wald, T. Khoshgoftaar, Amri Napolitano, Chris Sumner


Citations: 61

Abstract

The popularity of the Twitter social networking site has made it a target for social bots, which use increasingly complex algorithms to engage users and pretend to be humans. While much research has studied how to identify such bots in the process of spam detection, little research has looked at the other side of the question: detecting users likely to be fooled by bots. In this paper, we examine a dataset consisting of 610 users who were messaged by Twitter bots, and determine which features describing these users were most helpful in predicting whether or not they would interact with the bots (through replies or following the bot). We then use six classifiers to build models for predicting whether a given user will interact with the bot, both using the selected features and using all features. We find that a user's Klout score, friends count, and followers count are most predictive of whether a user will interact with a bot, and that the Random Forest algorithm produces the best classifier when used in conjunction with one of the better feature ranking algorithms (although poor feature ranking can actually make performance worse than no feature ranking). Overall, these results show promise for helping understand which users are most vulnerable to social bots.
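The abstract describes ranking user features and then building classifiers on either the selected features or all of them. The sketch below illustrates only the ranking-and-selection step, under stated assumptions: the abstract does not name the rankers used, so absolute Pearson correlation with the binary label stands in as a generic filter-style ranker. The column names (`klout_score`, `friends_count`, `followers_count`, `noise`) and the toy records are hypothetical and are not the paper's data.

```python
from statistics import mean, pstdev


def rank_features(rows, labels):
    """Rank feature names by |Pearson correlation| with a 0/1 label, strongest first.

    This is an assumed stand-in ranker, not one confirmed by the abstract.
    """
    names = list(rows[0].keys())
    label_mean, label_sd = mean(labels), pstdev(labels)
    scores = {}
    for name in names:
        values = [row[name] for row in rows]
        value_mean, value_sd = mean(values), pstdev(values)
        if value_sd == 0 or label_sd == 0:
            scores[name] = 0.0  # a constant column carries no signal
            continue
        covariance = mean(
            (v - value_mean) * (y - label_mean) for v, y in zip(values, labels)
        )
        scores[name] = abs(covariance / (value_sd * label_sd))
    return sorted(names, key=lambda n: scores[n], reverse=True)


def select_top_k(rows, ranked_names, k):
    """Keep only the k highest-ranked features in every row."""
    keep = ranked_names[:k]
    return [{name: row[name] for name in keep} for row in rows]


# Made-up toy records for illustration only (hypothetical column names).
toy_users = [
    {"klout_score": 10, "friends_count": 50, "followers_count": 40, "noise": 3},
    {"klout_score": 20, "friends_count": 80, "followers_count": 30, "noise": 7},
    {"klout_score": 30, "friends_count": 60, "followers_count": 90, "noise": 5},
    {"klout_score": 40, "friends_count": 200, "followers_count": 100, "noise": 5},
    {"klout_score": 50, "friends_count": 150, "followers_count": 70, "noise": 7},
    {"klout_score": 60, "friends_count": 300, "followers_count": 200, "noise": 3},
]
# 1 = the user interacted with the bot (reply or follow), 0 = did not.
toy_interacted = [0, 0, 0, 1, 1, 1]

ranked = rank_features(toy_users, toy_interacted)
reduced_users = select_top_k(toy_users, ranked, k=3)
```

In a full pipeline, a classifier would be trained once on `reduced_users` and once on `toy_users`, and the two compared. That comparison is the setup the abstract describes, and it is also how a poor ranker could show up as making performance worse than using all features.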

