Synthetic Networks That Preserve Edge Connectivity
Lahari Anne, The-Anh Vu-Le, Minhyuk Park, Tandy Warnow, George Chacko
arXiv - CS - Social and Information Networks, 2024-08-24
DOI: arxiv-2408.13647
Abstract
Since true communities within real-world networks are rarely known, synthetic
networks with planted ground truths are valuable for evaluating the performance
of community detection methods. Of the synthetic network generation tools
available, Stochastic Block Models (SBMs) produce networks with ground truth
clusters that closely approximate input parameters from real-world networks and
clusterings. However, we show that SBMs can produce disconnected ground truth
clusters, even when given parameters from clusterings where all clusters are
connected. Here we describe the REalistic Cluster Connectivity Simulator
(RECCS), a technique that modifies an SBM synthetic network to improve the fit
to a given clustered real-world network with respect to edge connectivity
within clusters, while maintaining the good fit with respect to other network
and cluster statistics. Using real-world networks with up to 13.9 million
nodes, we show that RECCS, applied to stochastic block models, yields
synthetic networks that fit cluster edge connectivity better than
unmodified SBMs, while fitting other network and clustering parameters
roughly as well as unmodified SBMs.
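
The observation that an SBM can plant disconnected ground truth clusters can be checked directly on a generated network. The following is a minimal sketch, not the RECCS method and not the authors' code: it samples a simple planted-partition SBM (a simplification; the abstract does not say which SBM variant is used) and reports which planted clusters are disconnected in their induced subgraphs. The names `sample_sbm`, `cluster_is_connected`, and `disconnected_clusters`, and the parameters `p_in` and `p_out`, are hypothetical and chosen for illustration only.

```python
import random
from collections import deque


def sample_sbm(block_sizes, p_in, p_out, seed=0):
    """Sample an undirected planted-partition SBM (illustrative assumption).

    Returns (adj, labels): adj maps node -> set of neighbours,
    labels[v] is the planted cluster of node v.
    """
    rng = random.Random(seed)
    labels = [b for b, size in enumerate(block_sizes) for _ in range(size)]
    n = len(labels)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            # Same-cluster pairs use p_in, cross-cluster pairs use p_out.
            p = p_in if labels[u] == labels[v] else p_out
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj, labels


def cluster_is_connected(adj, members):
    """Breadth-first search restricted to the subgraph induced by members."""
    members = set(members)
    if not members:
        return True
    start = next(iter(members))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            # Only follow edges whose endpoints both lie inside the cluster.
            if v in members and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == members


def disconnected_clusters(adj, labels):
    """Return the sorted ids of planted clusters that are not connected."""
    clusters = {}
    for v, b in enumerate(labels):
        clusters.setdefault(b, []).append(v)
    return [b for b, members in sorted(clusters.items())
            if not cluster_is_connected(adj, members)]


if __name__ == "__main__":
    # Sparse within-cluster edges make disconnected planted clusters likely.
    adj, labels = sample_sbm([30, 30, 30], p_in=0.05, p_out=0.01, seed=1)
    print("disconnected planted clusters:", disconnected_clusters(adj, labels))
```

Connectivity is tested on the induced subgraph, so a cluster whose nodes are joined only through nodes of other clusters still counts as disconnected. This checks only whether a cluster is connected at all; measuring how well edge connectivity is preserved would additionally require a minimum edge cut computation per cluster.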