{"title":"良好连接子图的社会媒体网络无偏抽样","authors":"Dong Wang, Zhenyu Li, Gareth Tyson, Zhenhua Li, Gaogang Xie","doi":"10.1145/3110025.3110141","DOIUrl":null,"url":null,"abstract":"Sampling social graphs is critical for studying things like information diffusion. However, it is often necessary to laboriously obtain unbiased and well-connected datasets because existing survey algorithms are unable to generate well-connected samples, and current random-walk based unbiased sampling algorithms adopt rejection sampling, which heavily undermines performance. This paper proposes a novel random-walk based algorithm which implements Unbiased Sampling using Dummy Edges (USDE). It injects dummy edges between nodes, on which the walkers would otherwise experience excessive rejections before moving out from such nodes. We propose a rejection probability estimation algorithm to facilitate the construction of dummy edges and the computation of moving probabilities. Finally, we apply USDE in two real-life social media: Twitter and Sina Weibo. The results demonstrate that USDE generates well-connected samples, and outperforms existing approaches in terms of sampling efficiency and quality of samples.","PeriodicalId":399660,"journal":{"name":"Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017","volume":"4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Unbiased Sampling of Social Media Networks for Well-connected Subgraphs\",\"authors\":\"Dong Wang, Zhenyu Li, Gareth Tyson, Zhenhua Li, Gaogang Xie\",\"doi\":\"10.1145/3110025.3110141\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Sampling social graphs is critical for studying things like information diffusion. However, it is often necessary to laboriously obtain unbiased and well-connected datasets because existing survey algorithms are unable to generate well-connected samples, and current random-walk based unbiased sampling algorithms adopt rejection sampling, which heavily undermines performance. This paper proposes a novel random-walk based algorithm which implements Unbiased Sampling using Dummy Edges (USDE). It injects dummy edges between nodes, on which the walkers would otherwise experience excessive rejections before moving out from such nodes. We propose a rejection probability estimation algorithm to facilitate the construction of dummy edges and the computation of moving probabilities. Finally, we apply USDE in two real-life social media: Twitter and Sina Weibo. The results demonstrate that USDE generates well-connected samples, and outperforms existing approaches in terms of sampling efficiency and quality of samples.\",\"PeriodicalId\":399660,\"journal\":{\"name\":\"Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017\",\"volume\":\"4 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-07-31\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3110025.3110141\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3110025.3110141","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Unbiased Sampling of Social Media Networks for Well-connected Subgraphs
Sampling social graphs is critical for studying things like information diffusion. However, it is often necessary to laboriously obtain unbiased and well-connected datasets because existing survey algorithms are unable to generate well-connected samples, and current random-walk based unbiased sampling algorithms adopt rejection sampling, which heavily undermines performance. This paper proposes a novel random-walk based algorithm which implements Unbiased Sampling using Dummy Edges (USDE). It injects dummy edges between nodes, on which the walkers would otherwise experience excessive rejections before moving out from such nodes. We propose a rejection probability estimation algorithm to facilitate the construction of dummy edges and the computation of moving probabilities. Finally, we apply USDE in two real-life social media: Twitter and Sina Weibo. The results demonstrate that USDE generates well-connected samples, and outperforms existing approaches in terms of sampling efficiency and quality of samples.