Scalable Community Detection in the Degree-Corrected Stochastic Block Model

Yicong He, Andre Beckus, George K. Atia
2021 IEEE 31st International Workshop on Machine Learning for Signal Processing (MLSP). Published 2021-10-25. DOI: 10.1109/mlsp52302.2021.9596377 (https://doi.org/10.1109/mlsp52302.2021.9596377)
Citations: 1

Abstract

Community detection aims to partition a connected graph into a small number of clusters. The Degree-Corrected Stochastic Block Model (DCSBM) is a popular generative model that yields graphs with varying degree distributions within communities. However, the high computational complexity and storage requirements of existing DCSBM approaches limit their scalability to large graphs. In this paper, we advance a scalable framework for DCSBM in which the full graph is first sub-sampled by selecting a small subset of the nodes, the induced subgraph is then clustered, and the global community structure is finally retrieved at low complexity from the clustering of the graph sketch. To sample the underlying graph, we introduce a family of sampling schemes that capture local community structure using metrics derived from average neighbor degrees; these schemes are shown to achieve the twin objectives of sampling from low-density clusters and identifying high-degree nodes within each cluster. Experiments on both synthetic and real data demonstrate that the proposed approach can perform on par with full-scale clustering while affording substantial complexity and storage gains.
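The three stages named in the abstract (sample, cluster the sketch, retrieve global labels) can be illustrated with a rough NumPy sketch. This is not the authors' implementation: the abstract does not give the sampling metrics, so the score `deg / (avg_neighbor_degree + 1)` is a hypothetical stand-in, and the Bernoulli graph generator, the spectral clustering step, and the edge-voting retrieval rule are likewise assumptions chosen for brevity. All function names are illustrative.

```python
import numpy as np

def average_neighbor_degree(A):
    """Mean degree of each node's neighbors (0 for isolated nodes)."""
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)
    nbr_sum = A @ deg  # sum of neighbor degrees per node
    return np.divide(nbr_sum, deg, out=np.zeros_like(deg), where=deg > 0)

def generate_dcsbm(sizes, p_in, p_out, rng):
    """Bernoulli variant of a DCSBM with random degree parameters (assumption)."""
    truth = np.repeat(np.arange(len(sizes)), sizes)
    n = truth.size
    theta = rng.uniform(0.5, 1.0, size=n)  # per-node degree heterogeneity
    B = np.where(truth[:, None] == truth[None, :], p_in, p_out)
    P = np.clip(theta[:, None] * theta[None, :] * B, 0.0, 1.0)
    upper = np.triu(rng.random((n, n)) < P, k=1)
    return (upper | upper.T).astype(float), truth

def sample_nodes(A, m, rng):
    """Weighted sampling without replacement using a HYPOTHETICAL score:
    nodes whose degree is high relative to their neighbors' are favored."""
    deg = A.sum(axis=1)
    score = deg / (average_neighbor_degree(A) + 1.0) + 1e-9
    return rng.choice(A.shape[0], size=m, replace=False, p=score / score.sum())

def spectral_cluster(A, k, rng, iters=50):
    """Spectral clustering with row-normalized eigenvectors and plain k-means."""
    deg = A.sum(axis=1)
    d = 1.0 / np.sqrt(np.maximum(deg, 1.0))
    L = d[:, None] * A * d[None, :]  # symmetrically normalized adjacency
    _, vecs = np.linalg.eigh(L)      # eigenvalues in ascending order
    X = vecs[:, -k:]                 # top-k eigenvectors
    X = X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1e-12)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(dist, axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels

def retrieve_labels(A, sampled, sketch_labels, k):
    """Assign every node to the sketch cluster it connects to most densely."""
    Z = np.eye(k)[sketch_labels]             # m x k cluster indicators
    sizes = np.maximum(Z.sum(axis=0), 1.0)
    votes = (A[:, sampled] @ Z) / sizes      # edges per sketch-cluster member
    labels = np.argmax(votes, axis=1)
    labels[sampled] = sketch_labels          # sampled nodes keep sketch labels
    return labels

rng = np.random.default_rng(0)
A, truth = generate_dcsbm([150, 150], p_in=0.5, p_out=0.05, rng=rng)
sampled = sample_nodes(A, m=60, rng=rng)
sketch_labels = spectral_cluster(A[np.ix_(sampled, sampled)], k=2, rng=rng)
labels = retrieve_labels(A, sampled, sketch_labels, k=2)
agree = np.mean(labels == truth)
print("accuracy up to label swap:", max(agree, 1.0 - agree))
```

Only the 60-node sketch is passed to the eigendecomposition; the retrieval step touches the full graph through a single product of the adjacency columns for the sampled nodes with a small indicator matrix, which is where the complexity and storage savings described in the abstract would come from in a design of this kind.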