Not Enough Data?: Joint Inferring Multiple Diffusion Networks via Network Generation Priors

Proceedings of the Tenth ACM International Conference on Web Search and Data Mining. Pub Date: 2017-02-02. DOI: 10.1145/3018661.3018675

Xinran He, Yan Liu


Citations: 16

Abstract

Network inference, i.e., discovering latent diffusion networks from observed cascades, has been studied extensively in recent years, leading to a body of excellent work. However, it has been observed that the accuracy of existing methods deteriorates significantly when the number of cascades is limited relative to the large number of nodes, which is the norm in real-world applications. Meanwhile, we are able to collect cascades on many different topics or over a long time period: the associated influence networks (either topic-specific or time-specific) are highly correlated, while the number of cascade observations associated with each network is very limited. In this work, we propose a generative model, referred to as the MultiCascades model (MCM), to address the challenge of data scarcity by exploiting the commonality between multiple related diffusion networks. MCM builds a hierarchical graphical model in which all the diffusion networks share the same network prior, e.g., the popular stochastic blockmodels or the latent space models. The parameters of the network prior can be learned effectively by gleaning evidence from a large number of inferred networks. In return, each individual network can be inferred more accurately thanks to the prior information. Furthermore, we develop efficient inference and learning algorithms so that MCM is scalable for practical applications. The results on both synthetic and real-world datasets demonstrate that MCM infers both topic-specific and time-varying diffusion networks more accurately.
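The idea of a shared network prior can be illustrated with a minimal sketch. This is not the authors' MCM inference or learning algorithm: it assumes block memberships are known and the networks are directly observed, whereas the paper infers networks from cascades. All names (`sample_sbm_network`, `estimate_block_probs`, `true_probs`) and all numeric values are hypothetical. It only shows why pooling many related networks helps recover the parameters of a stochastic blockmodel prior.

```python
import random


def sample_sbm_network(blocks, block_probs, rng):
    """Sample a directed edge set from a stochastic blockmodel prior."""
    n = len(blocks)
    edges = set()
    for u in range(n):
        for v in range(n):
            # An edge u -> v appears with a probability set by the two blocks.
            if u != v and rng.random() < block_probs[blocks[u]][blocks[v]]:
                edges.add((u, v))
    return edges


def estimate_block_probs(networks, blocks, num_blocks):
    """Pool edges across all networks to estimate the shared block probabilities."""
    edge_counts = [[0] * num_blocks for _ in range(num_blocks)]
    pair_counts = [[0] * num_blocks for _ in range(num_blocks)]
    n = len(blocks)
    for u in range(n):
        for v in range(n):
            if u != v:
                # Each ordered node pair is one trial per network.
                pair_counts[blocks[u]][blocks[v]] += len(networks)
    for edges in networks:
        for u, v in edges:
            edge_counts[blocks[u]][blocks[v]] += 1
    return [
        [
            edge_counts[a][b] / pair_counts[a][b] if pair_counts[a][b] else 0.0
            for b in range(num_blocks)
        ]
        for a in range(num_blocks)
    ]


rng = random.Random(0)
blocks = [0] * 10 + [1] * 10              # hypothetical: 20 nodes in 2 blocks
true_probs = [[0.5, 0.05], [0.05, 0.5]]   # hypothetical prior parameters

# Twenty related networks (e.g., one per topic) drawn from the same prior.
networks = [sample_sbm_network(blocks, true_probs, rng) for _ in range(20)]
estimated = estimate_block_probs(networks, blocks, 2)
print(estimated)
```

With a single network each block-pair probability is estimated from roughly 100 node pairs; pooling twenty networks gives about 2000, which is the intuition behind learning the prior from many networks and then using it to regularize each one.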


