Unbiased Sampling of Bipartite Graph
Authors: J. Wang, Yuchun Guo
Published in: 2011 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery
Publication date: 2011-10-10
DOI: 10.1109/CyberC.2011.63 (https://doi.org/10.1109/CyberC.2011.63)
Citations: 4
Abstract
The growing size of online social networks (OSNs) has motivated research on sampling methods that produce a relatively small but representative sample of a large-scale OSN, keeping the cost of measurement and analysis affordable. A number of sampling methods that crawl social graphs already exist, but most are suited to one-mode graphs, which contain only one type of node. The literature shows that Metropolis-Hastings Random Walk (MHRW) produces unbiased samples and performs better than other sampling methods. However, a growing number of online social networking sites, such as Taobao and eBay, have two types of nodes. Representing these two-mode networks as bipartite graphs, we study sampling methods for bipartite graphs in this paper. Our contributions are an analysis of the effectiveness of extending the MHRW algorithm to bipartite graphs and a modification of the sampling procedure that improves its stability. Finally, we compare our MHRW sampling algorithm with Random Walk (RW) on generated bipartite graphs as well as real two-mode network graphs. Simulations show that MHRW outperforms RW on bipartite graphs.
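To make the contrast between RW and MHRW concrete, here is a minimal Python sketch on a toy bipartite graph. It uses the textbook MHRW acceptance rule (move from node v to a proposed neighbor u with probability min(1, deg(v)/deg(u)), otherwise stay at v), not the modified sampling procedure proposed in the paper, which the abstract does not specify. The graph, the node names, and all function names are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter

# Hypothetical two-mode graph: users u1..u3 on one side, items i1..i3 on the other.
# Every edge joins a user to an item, so the graph is bipartite.
EDGES = [("u1", "i1"), ("u1", "i2"), ("u1", "i3"), ("u2", "i1"), ("u3", "i1")]


def build_adjacency(edges):
    """Return an undirected adjacency list {node: [neighbors]}."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    return adj


def random_walk(adj, start, steps, rng):
    """Simple RW: always move to a uniformly chosen neighbor."""
    counts = Counter()
    v = start
    for _ in range(steps):
        v = rng.choice(adj[v])
        counts[v] += 1
    return counts


def mhrw(adj, start, steps, rng):
    """Textbook MHRW targeting the uniform distribution over nodes."""
    counts = Counter()
    v = start
    for _ in range(steps):
        u = rng.choice(adj[v])  # propose a uniformly chosen neighbor
        # Accept with probability min(1, deg(v) / deg(u)); otherwise stay at v.
        if rng.random() < min(1.0, len(adj[v]) / len(adj[u])):
            v = u
        counts[v] += 1  # a rejected proposal counts the current node again
    return counts


def frequencies(counts, steps):
    """Convert visit counts into visit frequencies."""
    return {node: c / steps for node, c in counts.items()}


if __name__ == "__main__":
    adj = build_adjacency(EDGES)
    steps = 200_000
    rw = frequencies(random_walk(adj, "u1", steps, random.Random(0)), steps)
    mh = frequencies(mhrw(adj, "u1", steps, random.Random(0)), steps)
    for node in sorted(adj):
        print(f"{node}: degree={len(adj[node])} RW={rw[node]:.3f} MHRW={mh[node]:.3f}")
```

In this toy graph, RW visit frequencies should be roughly proportional to node degree (u1 and i1 have degree 3 out of a degree total of 10), while MHRW frequencies should be close to 1/6 for each of the six nodes. The rejected proposals in MHRW act as self-loops, which also removes the strict alternation between the two node types that a plain RW exhibits on a bipartite graph.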