Detecting deception in Online Social Networks

2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2014) Pub Date : 2014-08-17 DOI:10.1109/ASONAM.2014.6921614

Jalal S. Alowibdi, U. Buy, Philip S. Yu, Leon Stenneth


Citations: 26

Abstract

Over the past decade, Online Social Networks (OSNs) have helped hundreds of millions of people develop reliable computer-mediated relations. However, many user profiles in OSNs contain misleading, inconsistent or false information. Existing studies have shown that lying in OSNs is quite widespread, often to protect a user's privacy. For OSNs to continue expanding their role as a communication medium in our society, it is crucial that information posted on OSNs can be trusted. Here we define a set of analysis methods for detecting deceptive information about user gender on Twitter. In addition, we report empirical results on our stratified data set of 174,600 Twitter profiles, split 50-50 between male and female users. Our automated approach compares gender indicators obtained from different profile characteristics, including first name, user name, and layout colors. Through extensive experiments with our data set, we establish the overall accuracy of each indicator and the strength of every possible value of each indicator. We define male-trending and female-trending users based on two factors: the overall accuracy of each characteristic and the relative strength of the value of each characteristic for a given user. We apply a Bayesian classifier to the weighted average of characteristics for each user. We flag as possibly deceptive those profiles whose classification as male or female contradicts a self-declared gender that we obtain independently of Twitter profiles. Finally, we manually inspect a subset of the profiles that we identify as potentially deceptive in order to verify the correctness of our predictions.
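The flagging idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the sketch substitutes a plain accuracy-weighted average and a fixed threshold for the Bayesian classifier. The names `Indicator`, `female_score`, `trend`, `flag_possible_deception`, the `margin` parameter, and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Indicator:
    """One gender signal taken from a profile characteristic (hypothetical type)."""
    name: str        # e.g. "first_name", "user_name", "layout_color"
    accuracy: float  # assumed overall accuracy of this characteristic, in [0, 1]
    p_female: float  # assumed strength of the observed value: P(female | value)


def female_score(indicators):
    """Accuracy-weighted average of the per-indicator female probabilities."""
    total_weight = sum(ind.accuracy for ind in indicators)
    if total_weight == 0:
        return 0.5  # no usable evidence either way
    weighted = sum(ind.accuracy * ind.p_female for ind in indicators)
    return weighted / total_weight


def trend(indicators, margin=0.1):
    """Label a profile as female- or male-trending, or leave it undetermined.

    A simple threshold stands in for the Bayesian classifier named in the abstract.
    """
    score = female_score(indicators)
    if score >= 0.5 + margin:
        return "female"
    if score <= 0.5 - margin:
        return "male"
    return "undetermined"


def flag_possible_deception(indicators, declared_gender, margin=0.1):
    """Flag a profile whose trend contradicts the independently declared gender."""
    predicted = trend(indicators, margin)
    return predicted != "undetermined" and predicted != declared_gender


# Made-up example profile: all three indicators lean female.
profile = [
    Indicator("first_name", accuracy=0.8, p_female=0.9),
    Indicator("user_name", accuracy=0.7, p_female=0.8),
    Indicator("layout_color", accuracy=0.6, p_female=0.7),
]
print(trend(profile))
print(flag_possible_deception(profile, declared_gender="male"))
```

In this sketch a profile is flagged only when the combined indicators lean clearly one way and the self-declared gender says the opposite. Profiles near the 0.5 midpoint are left unflagged, which errs on the side of fewer false alarms.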

