Modeling Over-Dispersion for Network Data Clustering
Lu Wang, D. Zhu, Ming Dong, Yan Li
2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 42-49, published 2017-12-01
DOI: 10.1109/ICMLA.2017.0-180 (https://doi.org/10.1109/ICMLA.2017.0-180)
Over-dispersed network data mining has emerged as a central theme in data science, as evidenced by a sharp increase in the volume of real-world network data with imbalanced clusters. While most existing clustering methods are designed to discover the number of clusters and class-specific connectivity patterns, few can uncover the imbalanced clusters that commonly occur in network communities and image segments. In this paper, we propose a generalized probabilistic modeling framework, SizeConnectivity, to estimate the over-dispersed cluster size distribution together with class-specific connectivity patterns from network data. We performed extensive synthetic and real-world experiments on clustering social network data and image data to detect network communities and image segments. Our results demonstrate the superior performance of our SizeConnectivity clustering method in recovering the hidden structure of network data by modeling over-dispersion.
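
To make the two quantities named in the abstract concrete, the following is a minimal sketch in Python. It is not the SizeConnectivity model, whose details the abstract does not give. It only illustrates, under assumptions, what "over-dispersed cluster sizes" and "class-specific connectivity" can mean on a toy graph. The function names, the cluster sizes, and the edge probabilities are all hypothetical. Over-dispersion is measured here by the variance-to-mean ratio of the cluster sizes, a common convention in which values above 1 indicate more spread than a Poisson distribution with the same mean.

```python
import random
from statistics import mean, pvariance


def dispersion_index(sizes):
    """Variance-to-mean ratio of cluster sizes (above 1 suggests over-dispersion)."""
    return pvariance(sizes) / mean(sizes)


def sample_block_graph(sizes, p_in, p_out, seed=0):
    """Sample an undirected graph with planted clusters (illustrative toy generator).

    Nodes in the same cluster are linked with probability p_in,
    nodes in different clusters with probability p_out.
    """
    rng = random.Random(seed)
    labels = [k for k, n in enumerate(sizes) for _ in range(n)]
    edges = []
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            p = p_in if labels[i] == labels[j] else p_out
            if rng.random() < p:
                edges.append((i, j))
    return labels, edges


def block_densities(labels, edges):
    """Empirical edge density within clusters and between clusters."""
    n = len(labels)
    within_pairs = sum(
        1 for i in range(n) for j in range(i + 1, n) if labels[i] == labels[j]
    )
    between_pairs = n * (n - 1) // 2 - within_pairs
    within_edges = sum(1 for i, j in edges if labels[i] == labels[j])
    between_edges = len(edges) - within_edges
    return within_edges / within_pairs, between_edges / between_pairs


# Hypothetical cluster sizes: same total and same mean, very different spread.
balanced = [25, 25, 25, 25]
imbalanced = [70, 20, 7, 3]

labels, edges = sample_block_graph(imbalanced, p_in=0.3, p_out=0.02, seed=0)
within, between = block_densities(labels, edges)
```

In this toy setting `dispersion_index(balanced)` is 0 while `dispersion_index(imbalanced)` is far above 1, and `within` and `between` recover the planted connectivity only because the true labels are known. A clustering method has to infer the labels, the size distribution, and the connectivity pattern from the edges alone.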