Root and community inference on latent network growth processes using noisy attachment models

Impact factor: 3.1; Zone 1 (Mathematics); JCR Q1 (Statistics & Probability)
Harry Crane, Min Xu
Journal of the Royal Statistical Society Series B-Statistical Methodology. Published 2023-09-26. DOI: 10.1093/jrsssb/qkad102 (https://doi.org/10.1093/jrsssb/qkad102)

Abstract

Many existing statistical models for networks overlook the fact that most real-world networks are formed through a growth process. To address this, we introduce the PAPER (Preferential Attachment Plus Erdős-Rényi) model for random networks, where we let a random network G be the union of a preferential attachment (PA) tree T and additional Erdős-Rényi (ER) random edges. The PA tree component captures the underlying growth/recruitment process of a network where vertices and edges are added sequentially, while the ER component can be regarded as random noise. Given only a single snapshot of the final network G, we study the problem of constructing confidence sets for the early history, in particular the root node, of the unobserved growth process; the root node can be patient zero in a disease infection network or the source of fake news in a social media network. We propose an inference algorithm based on Gibbs sampling that scales to networks with millions of nodes and provide theoretical analysis showing that the expected size of the confidence set is small so long as the noise level of the ER edges is not too large. We also propose variations of the model in which multiple growth processes occur simultaneously, reflecting the growth of multiple communities, and we use these models to provide a new approach to community detection.
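The generative side of the PAPER model described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' implementation: it assumes the simplest linear attachment rule (a new node attaches to an existing node with probability proportional to that node's current tree degree), adds each non-tree pair as a noise edge independently with probability `theta`, and then shuffles the node labels so that the arrival order is hidden, as in a single observed snapshot. The function name `sample_paper_graph` and the parameter names are hypothetical.

```python
import random


def sample_paper_graph(n, theta, seed=None):
    """Sample a PA tree on n nodes plus Erdos-Renyi noise edges.

    Assumptions (not taken from the paper): attachment probability is
    proportional to current tree degree, and every non-tree pair becomes
    a noise edge independently with probability theta.
    Returns (tree_edges, noise_edges, root) using shuffled node labels.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    rng = random.Random(seed)

    # Grow the tree in arrival order 0, 1, ..., n-1; node 0 is the root.
    tree = [(0, 1)]
    # Each node appears in `endpoints` once per incident tree edge, so a
    # uniform draw from this list is a degree-proportional draw.
    endpoints = [0, 1]
    for new in range(2, n):
        parent = rng.choice(endpoints)
        tree.append((parent, new))
        endpoints.extend([parent, new])

    # Add noise edges among the pairs not already used by the tree.
    tree_set = set(tree)  # pairs are stored as (smaller, larger)
    noise = [
        (u, v)
        for u in range(n)
        for v in range(u + 1, n)
        if (u, v) not in tree_set and rng.random() < theta
    ]

    # Hide the arrival order by relabeling nodes with a random permutation.
    perm = list(range(n))
    rng.shuffle(perm)

    def relabel(edges):
        return {tuple(sorted((perm[u], perm[v]))) for u, v in edges}

    return relabel(tree), relabel(noise), perm[0]


# Example: a 200-node snapshot with a low noise level.
tree_edges, noise_edges, root = sample_paper_graph(200, theta=0.01, seed=0)
observed_edges = tree_edges | noise_edges  # what an analyst would see
```

In the inference problem, only `observed_edges` would be available; `tree_edges` and `root` are the latent quantities that the confidence sets target. The quadratic loop over pairs is kept for clarity and would need replacing for large networks.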