Fast nonparametric inference of network backbones for graph sparsification

arXiv - PHYS - Physics and Society. Pub Date: 2024-09-10. DOI: arxiv-2409.06417

Alec Kirkley



Abstract

A network backbone provides a useful sparse representation of a weighted network by keeping only its most important links, permitting a range of computational speedups and simplifying complex network visualizations. There are many possible criteria for a link to be considered important, and hence many methods have been developed for the task of network backboning for graph sparsification. These methods can be classified as global or local in nature depending on whether they evaluate the importance of an edge in the context of the whole network or an individual node neighborhood. A key limitation of existing network backboning methods is that they either artificially restrict the topology of the backbone to take a specific form (e.g. a tree) or they require the specification of a free parameter (e.g. a significance level) that determines the number of edges to keep in the backbone. Here we develop a completely nonparametric framework for inferring the backbone of a weighted network that overcomes these limitations by automatically selecting the optimal number of edges to retain in the backbone using the Minimum Description Length (MDL) principle from information theory. We develop two encoding schemes that serve as objective functions for global and local network backbones, as well as efficient optimization algorithms to identify the optimal backbones according to these objectives with runtime complexity log-linear in the number of edges. We show that the proposed framework is generalizable to any discrete weight distribution on the edges using a maximum a posteriori (MAP) estimation procedure with an asymptotically equivalent Bayesian generative model of the backbone. We compare the proposed method with existing methods in a range of tasks on real and synthetic networks.
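To make the idea of letting a description length choose the number of retained edges concrete, here is a minimal sketch in Python. It is not the encoding or the optimization algorithm from the paper: the two-part code below (which edges are kept, then how the total weight is split among kept and dropped edges) is an illustrative assumption, as are the function names `log_binom`, `log_compositions`, and `mdl_global_backbone`, the restriction to positive integer weights, and the restriction of candidate backbones to the k heaviest edges. It does show how a single sort followed by a linear scan gives a runtime that is log-linear in the number of edges, with no significance level or other free parameter.

```python
from math import lgamma, log


def log_binom(n, k):
    """Natural log of the binomial coefficient C(n, k), for 0 <= k <= n."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def log_compositions(total, parts):
    """Log of the number of ways to split `total` into `parts` positive integers."""
    if parts == 0:
        return 0.0  # only reached with total == 0: exactly one (empty) split
    return log_binom(total - 1, parts - 1)


def mdl_global_backbone(edges):
    """Pick the top-k heaviest edges minimizing a toy description length.

    `edges` is a list of (u, v, w) tuples with positive integer weights w.
    The code length used here is an assumed, illustrative encoding and not
    the objective defined in the paper.
    """
    ranked = sorted(edges, key=lambda e: e[2], reverse=True)  # O(E log E)
    n_edges = len(ranked)
    total_w = sum(w for _, _, w in ranked)

    best_k, best_len = 0, None
    backbone_w = 0  # running weight of the k heaviest edges
    for k in range(n_edges + 1):  # O(E) scan over candidate sizes
        if k > 0:
            backbone_w += ranked[k - 1][2]
        length = (
            log(n_edges + 1)  # cost of sending k
            + log(total_w + 1)  # cost of sending the backbone weight
            + log_binom(n_edges, k)  # which edges are in the backbone
            + log_compositions(backbone_w, k)  # weights inside the backbone
            + log_compositions(total_w - backbone_w, n_edges - k)  # the rest
        )
        if best_len is None or length < best_len:
            best_k, best_len = k, length
    return ranked[:best_k], best_len


# Hypothetical toy network: two heavy edges and six unit-weight edges.
toy_edges = [("a", "b", 100), ("a", "c", 90)]
toy_edges += [(f"x{i}", f"y{i}", 1) for i in range(6)]
backbone, code_length = mdl_global_backbone(toy_edges)
print([w for _, _, w in backbone])  # weights of the retained edges
```

On this toy input the scan retains only the two heavy edges: once they are separated out, the remaining six unit-weight edges cost nothing to describe, so adding or removing any edge lengthens the code. A local variant would apply a comparable trade-off within each node's neighborhood rather than across the whole edge list.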



