Fast Clustering of Hypergraphs Based on Bipartite-Edge Restoration and Node Reachability
Shuta Ito, Takayasu Fushimi
Proceedings of the 22nd International Conference on Information Integration and Web-based Applications & Services
Published: 2020-11-30
DOI: 10.1145/3428757.3429095 (https://doi.org/10.1145/3428757.3429095)
Citations: 3
Abstract
In recent years, hypergraphs, which generalize graphs and can represent relationships among two or more nodes, have been actively studied; however, clustering methods for them have not yet been established. In this study, we propose a fast clustering method in which a hypergraph is expanded into a bipartite graph by treating hyperedges as nodes, the relationship between a node and a hyperedge is quantified by TF-IDF, and that value is used as the weight of the corresponding bipartite edge. Our algorithm efficiently identifies clusters by restoring bipartite edges in descending order of TF-IDF weight and merging nodes that are reachable along the restored edges into clusters. Based on bipartite-edge restoration and node reachability, the method achieves efficient and effective clustering of large-scale hypergraphs. Furthermore, by applying our method to hyperedges, one can obtain hard-partitioned hyperedges and, by exploiting them, soft-partitioned nodes. Experimental evaluations on small-scale artificial and large-scale real datasets show that our method outputs more accurate clusters in terms of modularity and the F1 score between the estimated and actual clusters. In addition, our method runs significantly faster than the existing method it is compared against. For soft clustering, our method produces clusters of more balanced size.
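
The edge-restoration idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the exact TF-IDF definition or the stopping rule, so the sketch assumes TF = 1/|e| for a node in hyperedge e, IDF = log(|E| / deg(v)) + 1, and that restoration stops once the number of node clusters drops to a target `k`. The function name `cluster_hypergraph` and its parameters are hypothetical. Reachability is tracked with a union-find structure over both sides of the bipartite graph.

```python
import math
from collections import defaultdict


def cluster_hypergraph(hyperedges, k):
    """Cluster nodes of a hypergraph given as a list of node sets.

    Assumptions (not taken from the paper): TF = 1/|e|,
    IDF = log(|E| / deg(v)) + 1, and stop when at most k node clusters remain.
    """
    n_edges = len(hyperedges)
    degree = defaultdict(int)  # number of hyperedges containing each node
    for e in hyperedges:
        for v in e:
            degree[v] += 1

    # Bipartite expansion: one weighted edge per (node, hyperedge) incidence.
    weighted = []
    for j, e in enumerate(hyperedges):
        for v in e:
            tf = 1.0 / len(e)
            idf = math.log(n_edges / degree[v]) + 1.0
            weighted.append((tf * idf, v, j))
    weighted.sort(key=lambda t: t[0], reverse=True)

    # Union-find over original nodes ("v", id) and hyperedge-nodes ("e", index).
    parent = {("v", v): ("v", v) for v in degree}
    parent.update({("e", j): ("e", j) for j in range(n_edges)})
    has_node = {key: key[0] == "v" for key in parent}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    n_clusters = len(degree)  # every node starts as its own cluster
    for _, v, j in weighted:  # restore edges, heaviest first
        if n_clusters <= k:
            break
        root_v, root_e = find(("v", v)), find(("e", j))
        if root_v == root_e:
            continue
        if has_node[root_e]:
            n_clusters -= 1  # two groups that both hold nodes become one
        parent[root_e] = root_v  # root_v always holds at least one node

    clusters = defaultdict(set)
    for v in degree:
        clusters[find(("v", v))].add(v)
    return list(clusters.values())


example = [
    {"a", "b"}, {"a", "b", "c"},
    {"x", "y"}, {"x", "y", "z"},
    {"a", "b", "c", "x", "y", "z"},  # large hyperedge bridging both groups
]
result = cluster_hypergraph(example, k=2)
```

In this toy input, the large bridging hyperedge receives the lowest weights under the assumed TF, so its bipartite edges are never restored before the target of two clusters is reached, and the nodes split into {a, b, c} and {x, y, z}. A different TF-IDF definition or stopping rule would change the restoration order and possibly the result.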