Konstantin Avrachenkov, Vivek S Borkar, Arun Kadavankandy, Jithin K Sreedharan
{"title":"网络中基于随机漫步的重访抽样:回避老化期和频繁再生。","authors":"Konstantin Avrachenkov, Vivek S Borkar, Arun Kadavankandy, Jithin K Sreedharan","doi":"10.1186/s40649-018-0051-0","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>In the framework of network sampling, random walk (RW) based estimation techniques provide many pragmatic solutions while uncovering the unknown network as little as possible. Despite several theoretical advances in this area, RW based sampling techniques usually make a strong assumption that the samples are in stationary regime, and hence are impelled to leave out the samples collected during the burn-in period.</p><p><strong>Methods: </strong>This work proposes two sampling schemes without burn-in time constraint to estimate the average of an arbitrary function defined on the network nodes, for example, the average age of users in a social network. The central idea of the algorithms lies in exploiting regeneration of RWs at revisits to an aggregated super-node or to a set of nodes, and in strategies to enhance the frequency of such regenerations either by contracting the graph or by making the hitting set larger. Our first algorithm, which is based on reinforcement learning (RL), uses stochastic approximation to derive an estimator. This method can be seen as intermediate between purely stochastic Markov chain Monte Carlo iterations and deterministic relative value iterations. The second algorithm, which we call the Ratio with Tours (RT)-estimator, is a modified form of respondent-driven sampling (RDS) that accommodates the idea of regeneration.</p><p><strong>Results: </strong>We study the methods via simulations on real networks. We observe that the trajectories of RL-estimator are much more stable than those of standard random walk based estimation procedures, and its error performance is comparable to that of respondent-driven sampling (RDS) which has a smaller asymptotic variance than many other estimators. Simulation studies also show that the mean squared error of RT-estimator decays much faster than that of RDS with time.</p><p><strong>Conclusion: </strong>The newly developed RW based estimators (RL- and RT-estimators) allow to avoid burn-in period, provide better control of stability along the sample path, and overall reduce the estimation time. Our estimators can be applied in social and complex networks.</p>","PeriodicalId":52145,"journal":{"name":"Computational Social Networks","volume":"5 1","pages":"4"},"PeriodicalIF":0.0000,"publicationDate":"2018-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/s40649-018-0051-0","citationCount":"8","resultStr":"{\"title\":\"Revisiting random walk based sampling in networks: evasion of burn-in period and frequent regenerations.\",\"authors\":\"Konstantin Avrachenkov, Vivek S Borkar, Arun Kadavankandy, Jithin K Sreedharan\",\"doi\":\"10.1186/s40649-018-0051-0\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Background: </strong>In the framework of network sampling, random walk (RW) based estimation techniques provide many pragmatic solutions while uncovering the unknown network as little as possible. 
Despite several theoretical advances in this area, RW based sampling techniques usually make a strong assumption that the samples are in stationary regime, and hence are impelled to leave out the samples collected during the burn-in period.</p><p><strong>Methods: </strong>This work proposes two sampling schemes without burn-in time constraint to estimate the average of an arbitrary function defined on the network nodes, for example, the average age of users in a social network. The central idea of the algorithms lies in exploiting regeneration of RWs at revisits to an aggregated super-node or to a set of nodes, and in strategies to enhance the frequency of such regenerations either by contracting the graph or by making the hitting set larger. Our first algorithm, which is based on reinforcement learning (RL), uses stochastic approximation to derive an estimator. This method can be seen as intermediate between purely stochastic Markov chain Monte Carlo iterations and deterministic relative value iterations. The second algorithm, which we call the Ratio with Tours (RT)-estimator, is a modified form of respondent-driven sampling (RDS) that accommodates the idea of regeneration.</p><p><strong>Results: </strong>We study the methods via simulations on real networks. We observe that the trajectories of RL-estimator are much more stable than those of standard random walk based estimation procedures, and its error performance is comparable to that of respondent-driven sampling (RDS) which has a smaller asymptotic variance than many other estimators. Simulation studies also show that the mean squared error of RT-estimator decays much faster than that of RDS with time.</p><p><strong>Conclusion: </strong>The newly developed RW based estimators (RL- and RT-estimators) allow to avoid burn-in period, provide better control of stability along the sample path, and overall reduce the estimation time. Our estimators can be applied in social and complex networks.</p>\",\"PeriodicalId\":52145,\"journal\":{\"name\":\"Computational Social Networks\",\"volume\":\"5 1\",\"pages\":\"4\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.1186/s40649-018-0051-0\",\"citationCount\":\"8\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computational Social Networks\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1186/s40649-018-0051-0\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2018/3/19 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q1\",\"JCRName\":\"Mathematics\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computational Social Networks","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1186/s40649-018-0051-0","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2018/3/19 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"Mathematics","Score":null,"Total":0}
Revisiting random walk based sampling in networks: evasion of burn-in period and frequent regenerations.
Background: In the framework of network sampling, random walk (RW) based estimation techniques provide many pragmatic solutions while uncovering as little of the unknown network as possible. Despite several theoretical advances in this area, RW-based sampling techniques usually make the strong assumption that the samples are in the stationary regime, and hence are compelled to discard the samples collected during the burn-in period.
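To make the burn-in issue concrete, here is a minimal sketch of a standard RW-based estimator of a node-level average, written in Python with networkx; the graph, the function f, the walk length, and the burn-in length are illustrative choices and not taken from the paper. The ratio form re-weights each sample by 1/degree because a simple random walk visits nodes in proportion to their degree, and the first burn-in samples are simply thrown away.

```python
import random
import networkx as nx

def rw_estimate_with_burn_in(G, f, num_steps=10_000, burn_in=1_000, start=None):
    """Estimate the average of f over the nodes of G with a simple random walk.

    Samples collected during the first `burn_in` steps are discarded, which is
    exactly the cost the estimators in this paper avoid. The ratio form
    re-weights each sample by 1/degree because the walk's stationary
    distribution is proportional to node degree.
    """
    node = start if start is not None else random.choice(list(G.nodes))
    num = den = 0.0
    for t in range(num_steps):
        if t >= burn_in:                      # keep only post-burn-in samples
            num += f(node) / G.degree(node)
            den += 1.0 / G.degree(node)
        node = random.choice(list(G.neighbors(node)))   # uniform neighbor step
    return num / den

# Illustrative usage: estimate the average degree of a synthetic small-world graph.
G = nx.connected_watts_strogatz_graph(2000, 6, 0.1, seed=1)
print(rw_estimate_with_burn_in(G, f=G.degree))
```

With the burn-in set to a sizeable fraction of the sampling budget, the discarded samples are pure overhead; the regeneration-based schemes described next avoid this waste.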
Methods: This work proposes two sampling schemes without a burn-in time constraint to estimate the average of an arbitrary function defined on the network nodes, for example, the average age of users in a social network. The central idea of the algorithms lies in exploiting regenerations of the RW at revisits to an aggregated super-node or to a set of nodes, and in strategies to increase the frequency of such regenerations, either by contracting the graph or by enlarging the hitting set. Our first algorithm, based on reinforcement learning (RL), uses stochastic approximation to derive an estimator; it can be seen as intermediate between purely stochastic Markov chain Monte Carlo iterations and deterministic relative value iterations. The second algorithm, which we call the Ratio with Tours (RT) estimator, is a modified form of respondent-driven sampling (RDS) that accommodates the idea of regeneration.
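The following is a hedged sketch of the shared tour/regeneration idea only, not the paper's exact RL- or RT-estimator: a set S of nodes plays the role of an aggregated super-node, each return of the walk to S closes a tour, and the estimate is a ratio of degree-weighted sums accumulated over independent tours, so no burn-in is needed. The seed set, tour count, graph, and target function below are illustrative assumptions.

```python
import random
import networkx as nx

def tour_ratio_estimate(G, f, S, num_tours=500):
    """Regenerative (tour-based) sketch of estimating the average of f over G's nodes.

    The seed set S acts as an aggregated super-node: each tour starts at a
    degree-biased node of S and ends at the first return to S. Tours are
    i.i.d. regeneration cycles, so no burn-in is required and per-tour sums
    can be used directly for error assessment.
    """
    S = list(S)
    S_set = set(S)
    start_weights = [G.degree(u) for u in S]
    num = den = 0.0
    for _ in range(num_tours):
        node = random.choices(S, weights=start_weights, k=1)[0]  # degree-biased start in S
        while True:
            num += f(node) / G.degree(node)
            den += 1.0 / G.degree(node)
            node = random.choice(list(G.neighbors(node)))        # uniform neighbor step
            if node in S_set:                                     # regeneration: tour ends
                break
    return num / den

# Illustrative usage: estimate the average clustering coefficient, using the
# 20 highest-degree nodes as the hitting set so that tours stay short.
G = nx.barabasi_albert_graph(2000, 3, seed=1)
hubs = sorted(G.nodes, key=G.degree, reverse=True)[:20]
cc = nx.clustering(G)
print(tour_ratio_estimate(G, lambda v: cc[v], hubs))
```

Enlarging S (or contracting it into a super-node) makes returns, and hence regenerations, more frequent, which is exactly the lever the two proposed schemes exploit; the paper's RL-estimator additionally applies stochastic approximation at these regeneration epochs rather than the plain ratio shown here.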
Results: We study the methods via simulations on real networks. We observe that the trajectories of the RL-estimator are much more stable than those of standard random walk based estimation procedures, and its error performance is comparable to that of respondent-driven sampling (RDS), which has a smaller asymptotic variance than many other estimators. Simulation studies also show that the mean squared error of the RT-estimator decays with time much faster than that of RDS.
Conclusion: The newly developed RW-based estimators (the RL- and RT-estimators) avoid the burn-in period, provide better control of stability along the sample path, and overall reduce the estimation time. Our estimators can be applied in social and other complex networks.
Journal description:
Computational Social Networks showcases rigorously refereed papers dealing with all mathematical, computational, and applied aspects of social computing. The objective of the journal is to advance and promote the theoretical foundations, mathematical aspects, and applications of social computing. Submissions are welcome that focus on common principles, algorithms, and tools governing network structures/topologies, network functionalities, security and privacy, network behaviors, information diffusion and influence, and social recommendation systems applicable to all types of social networks and social media. Topics include (but are not limited to) the following:

- Social network design and architecture
- Mathematical modeling and analysis
- Real-world complex networks
- Information retrieval in social contexts, political analysis
- Network structure analysis
- Network dynamics optimization
- Complex network robustness and vulnerability
- Information diffusion models and analysis
- Security and privacy
- Searching in complex networks
- Efficient algorithms
- Network behaviors
- Trust and reputation
- Social influence
- Social recommendation
- Social media analysis
- Big data analysis on online social networks

The journal also includes reviews of appropriate books, as well as special issues on hot topics.