Modeling Over-Dispersion for Network Data Clustering

Lu Wang, D. Zhu, Ming Dong, Yan Li
Published in: 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 42-49
Publication date: 2017-12-01
DOI: https://doi.org/10.1109/ICMLA.2017.0-180
Citations: 3

Abstract

Over-dispersed network data mining has emerged as a central theme in data science, as evidenced by a sharp increase in the volume of real-world network data with imbalanced clusters. While most existing clustering methods are designed to discover the number of clusters and class-specific connectivity patterns, few methods are available to uncover imbalanced clusters, which commonly occur in network communities and image segments. In this paper, we propose a generalized probabilistic modeling framework, SizeConnectivity, to estimate the over-dispersed cluster size distribution together with class-specific connectivity patterns from network data. We performed extensive synthetic and real-world experiments on clustering social network data and image data to detect network communities and image segments. Our results demonstrate the superior performance of our SizeConnectivity clustering method in recovering the hidden structure of network data by modeling over-dispersion.