Unbiased Sampling of Social Media Networks for Well-connected Subgraphs

Dong Wang, Zhenyu Li, Gareth Tyson, Zhenhua Li, Gaogang Xie
Published in: Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017
Publication date: 2017-07-31
DOI: 10.1145/3110025.3110141 (https://doi.org/10.1145/3110025.3110141)

Abstract

Sampling social graphs is essential for studying phenomena such as information diffusion. However, obtaining datasets that are both unbiased and well connected is often laborious: existing survey algorithms cannot generate well-connected samples, and current random-walk-based unbiased sampling algorithms rely on rejection sampling, which heavily degrades performance. This paper proposes a novel random-walk-based algorithm, Unbiased Sampling using Dummy Edges (USDE). USDE injects dummy edges between nodes at which walkers would otherwise experience excessive rejections before moving on. We propose a rejection probability estimation algorithm to facilitate the construction of dummy edges and the computation of moving probabilities. Finally, we apply USDE to two real-life social media platforms: Twitter and Sina Weibo. The results demonstrate that USDE generates well-connected samples and outperforms existing approaches in sampling efficiency and sample quality.
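To make the cost of rejection sampling concrete, the following is a minimal sketch of a rejection-based unbiased random walk. It is not USDE. It assumes the Metropolis-Hastings random walk as a representative baseline (the abstract does not name the baselines), and the toy graph, the function names `build_toy_graph` and `mh_random_walk`, the step count, and the seed are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def build_toy_graph(n_leaves=8):
    """Hypothetical toy graph: hub node 0 linked to every leaf, leaves joined in a ring."""
    adj = {node: [] for node in range(n_leaves + 1)}
    for leaf in range(1, n_leaves + 1):
        nxt = leaf % n_leaves + 1  # next leaf on the ring (wraps around to 1)
        adj[0].append(leaf)
        adj[leaf].append(0)
        adj[leaf].append(nxt)
        adj[nxt].append(leaf)
    return adj


def mh_random_walk(adj, start, steps, rng):
    """Metropolis-Hastings random walk targeting the uniform node distribution.

    Returns the list of visited nodes (one per step) and the number of
    rejected proposals, i.e. steps on which the walker did not move.
    """
    current = start
    samples = []
    rejections = 0
    for _ in range(steps):
        candidate = rng.choice(adj[current])
        # Accept with probability min(1, deg(current) / deg(candidate)),
        # which cancels the degree bias of a simple random walk.
        accept_prob = min(1.0, len(adj[current]) / len(adj[candidate]))
        if rng.random() < accept_prob:
            current = candidate
        else:
            rejections += 1  # wasted step: the walker stays where it is
        samples.append(current)
    return samples, rejections


STEPS = 100_000  # assumed walk length for the demonstration
adj = build_toy_graph()
samples, rejections = mh_random_walk(adj, start=1, steps=STEPS, rng=random.Random(42))

counts = Counter(samples)
print("rejection rate:", rejections / STEPS)
for node in sorted(adj):
    print(node, "degree", len(adj[node]), "share", round(counts[node] / STEPS, 3))
```

In this sketch every leaf has degree 3 and the hub has degree 8, so a proposal from a leaf to the hub is accepted with probability 3/8 and otherwise rejected. The visit shares come out close to uniform, but a noticeable fraction of steps is spent on rejections, which is the inefficiency the abstract says USDE is designed to reduce.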