{"title":"基于随机漫步的在线社交网络虚假账户检测","authors":"Jinyuan Jia, Binghui Wang, N. Gong","doi":"10.1109/DSN.2017.55","DOIUrl":null,"url":null,"abstract":"Online social networks are known to be vulnerable to the so-called Sybil attack, in which an attacker maintains massive fake accounts (also called Sybils) and uses them to perform various malicious activities. Therefore, Sybil detection is a fundamental security research problem in online social networks. Random walk based methods, which leverage the structure of an online social network to distribute reputation scores for users, have been demonstrated to be promising in certain real-world online social networks. In particular, random walk based methods have three desired features: they can have theoretically guaranteed performance for online social networks that have the fast-mixing property, they are accurate when the social network has strong homophily property, and they can be scalable to large-scale online social networks. However, existing random walk based methods suffer from several key limitations: 1) they can only leverage either labeled benign users or labeled Sybils, but not both, 2) they have limited detection accuracy for weak-homophily social networks, and 3) they are not robust to label noise in the training dataset. In this work, we propose a new random walk based Sybil detection method called SybilWalk. SybilWalk addresses the limitations of existing random walk based methods while maintaining their desired features. We perform both theoretical and empirical evaluations to compare SybilWalk with previous random walk based methods. Theoretically, for online social networks with the fast-mixing property, SybilWalk has a tighter asymptotical bound on the number of Sybils that are falsely accepted into the social network than all existing random walk based methods. Empirically, we compare SybilWalk with previous random walk based methods using both social networks with synthesized Sybils and a large-scale Twitter dataset with real Sybils. Our empirical results demonstrate that 1) SybilWalk is substantially more accurate than existing random walk based methods for weakhomophily social networks, 2) SybilWalk is substantially more robust to label noise than existing random walk based methods, and 3) SybilWalk is as scalable as the most efficient existing random walk based methods. In particular, on the Twitter dataset, SybilWalk achieves a false positive rate of 1.3% and a false negative rate of 17.3%.","PeriodicalId":426928,"journal":{"name":"2017 47th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"107","resultStr":"{\"title\":\"Random Walk Based Fake Account Detection in Online Social Networks\",\"authors\":\"Jinyuan Jia, Binghui Wang, N. Gong\",\"doi\":\"10.1109/DSN.2017.55\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Online social networks are known to be vulnerable to the so-called Sybil attack, in which an attacker maintains massive fake accounts (also called Sybils) and uses them to perform various malicious activities. Therefore, Sybil detection is a fundamental security research problem in online social networks. Random walk based methods, which leverage the structure of an online social network to distribute reputation scores for users, have been demonstrated to be promising in certain real-world online social networks. In particular, random walk based methods have three desired features: they can have theoretically guaranteed performance for online social networks that have the fast-mixing property, they are accurate when the social network has strong homophily property, and they can be scalable to large-scale online social networks. However, existing random walk based methods suffer from several key limitations: 1) they can only leverage either labeled benign users or labeled Sybils, but not both, 2) they have limited detection accuracy for weak-homophily social networks, and 3) they are not robust to label noise in the training dataset. In this work, we propose a new random walk based Sybil detection method called SybilWalk. SybilWalk addresses the limitations of existing random walk based methods while maintaining their desired features. We perform both theoretical and empirical evaluations to compare SybilWalk with previous random walk based methods. Theoretically, for online social networks with the fast-mixing property, SybilWalk has a tighter asymptotical bound on the number of Sybils that are falsely accepted into the social network than all existing random walk based methods. Empirically, we compare SybilWalk with previous random walk based methods using both social networks with synthesized Sybils and a large-scale Twitter dataset with real Sybils. Our empirical results demonstrate that 1) SybilWalk is substantially more accurate than existing random walk based methods for weakhomophily social networks, 2) SybilWalk is substantially more robust to label noise than existing random walk based methods, and 3) SybilWalk is as scalable as the most efficient existing random walk based methods. In particular, on the Twitter dataset, SybilWalk achieves a false positive rate of 1.3% and a false negative rate of 17.3%.\",\"PeriodicalId\":426928,\"journal\":{\"name\":\"2017 47th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)\",\"volume\":\"32 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"107\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 47th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/DSN.2017.55\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 47th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DSN.2017.55","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Random Walk Based Fake Account Detection in Online Social Networks
Online social networks are known to be vulnerable to the so-called Sybil attack, in which an attacker maintains massive fake accounts (also called Sybils) and uses them to perform various malicious activities. Therefore, Sybil detection is a fundamental security research problem in online social networks. Random walk based methods, which leverage the structure of an online social network to distribute reputation scores for users, have been demonstrated to be promising in certain real-world online social networks. In particular, random walk based methods have three desired features: they can have theoretically guaranteed performance for online social networks that have the fast-mixing property, they are accurate when the social network has strong homophily property, and they can be scalable to large-scale online social networks. However, existing random walk based methods suffer from several key limitations: 1) they can only leverage either labeled benign users or labeled Sybils, but not both, 2) they have limited detection accuracy for weak-homophily social networks, and 3) they are not robust to label noise in the training dataset. In this work, we propose a new random walk based Sybil detection method called SybilWalk. SybilWalk addresses the limitations of existing random walk based methods while maintaining their desired features. We perform both theoretical and empirical evaluations to compare SybilWalk with previous random walk based methods. Theoretically, for online social networks with the fast-mixing property, SybilWalk has a tighter asymptotical bound on the number of Sybils that are falsely accepted into the social network than all existing random walk based methods. Empirically, we compare SybilWalk with previous random walk based methods using both social networks with synthesized Sybils and a large-scale Twitter dataset with real Sybils. Our empirical results demonstrate that 1) SybilWalk is substantially more accurate than existing random walk based methods for weakhomophily social networks, 2) SybilWalk is substantially more robust to label noise than existing random walk based methods, and 3) SybilWalk is as scalable as the most efficient existing random walk based methods. In particular, on the Twitter dataset, SybilWalk achieves a false positive rate of 1.3% and a false negative rate of 17.3%.